

Journal 
of Speech and Hearing 


Research 


September 1961 VOLUME 4 • NUMBER 3


A Psychophysical Investigation of Vowel Formants 
GRANT FAIRBANKS AND PATTI GRUBB 


Dimensions of Language Performance in Aphasia 
LYLE V. JONES AND JOSEPH M. WEPMAN 


Bilabial Stop and Nasal Consonants: a Motion Picture Study 
and its Acoustical Implications 
OSAMU FUJIMURA 


Intelligibility of Slow-Played Speech 
WILLIAM R. TIFFANY AND DELMOND N. BENNETT 


Children’s Articulation and Sound Learning Ability 
HARRIS WINITZ AND MARTHA LAWRENCE 


Reliability of Conditioned GSR Pure-Tone Audiometry 
with Adult Males 


JOSEPH B. CHAIKLIN, IRA M. VENTRY, AND LYMAN S. BARRETT 


Identification of Stuttering during Relatively Fluent Speech 
RONALD W. WENDAHL AND JANE COLE 


Intellectual Impairment in Children with Cleft Palates 
LEONARD D. GOODSTEIN 


Letters to the Editor 
HILDRED SCHUELL AND JAMES J. JENKINS 


Research News Notes 








American 
Speech and Hearing 
Association 


OFFICERS 


President 
G. Paul Moore, Ph.D. 
Northwestern University 


Executive Vice-President 
Jack Matthews, Ph.D. 
University of Pittsburgh 
Vice-President 


Duane C. Spriestersbach, Ph.D. 
University of Iowa 


OFFICERS-ELECT 


President-Elect 
James F. Curtis, Ph.D. 
University of Iowa 


Vice-President-Elect 
Elise S. Hahn, Ph.D. 
University of California 
Los Angeles 


COUNCIL 

The Officers and the following 
Councilors: 

Stanley Ainsworth, Ph.D. (1961) 
Oliver Bloodstein, Ph.D. (1960-62) 
William G. Hardy, Ph.D. (1960-62) 
Ruth B. Irwin, Ph.D. (1960-63) 
James Jerger, Ph.D. (1959-61) 
Wendell Johnson, Ph.D. (1959-62) 
Hayes A. Newby, Ph.D. (1960-63) 
Mildred Templin, Ph.D. (1961-63) 
Joseph M. Wepman, Ph.D. (1961-63) 
Dean E. Williams, Ph.D. (1959-61) 


EXECUTIVE SECRETARY 
Kenneth O. Johnson, Ph.D. 





Journal 
of Speech and Hearing 
Research 


EDITOR 
Dorothy Sherman, Ph.D. 


ASSISTANT TO THE EDITOR 
Dorothy W. Moeller 


STATISTICAL CONSULTANT 
Leonard S. Feldt, Ph.D. 


ASSOCIATE EDITORS 
Oliver Bloodstein, Ph.D. 
Arthur S. House, Ph.D. 
James Jerger, Ph.D. 

D. E. Morley, Ph.D. 
Hildred Schuell, Ph.D. 
Arnold M. Small, Ph.D. 
William R. Tiffany, Ph.D. 
John C. Webster, Ph.D. 
Joseph M. Wepman, Ph.D. 
Herbert N. Wright, Ph.D. 


ASSISTANT EDITORS 


Kenneth L. Moll, Ph.D.


Martin A. Young, Ph.D. 


SPECIAL EDITORS 
Ernest H. Henrikson, Ph.D. 
Book Reviews 


Martin F. Palmer, Sc.D. 
Records 


Elizabeth Moodie Prather, Ph.D. 
Interlingua 


BUSINESS MANAGER 
Kenneth O. Johnson, Ph.D. 


ASHA PUBLICATIONS BOARD 


Virgil A. Anderson (1959-61); John Black (1961-63); Ernest J. Burgi, Representative, House
of Delegates; Raymond Carhart (1960-62); Frederic Darley, Editor, Monographs (1960-62);
Mary Huber, Editor, Journal of Speech and Hearing Disorders (1959-62); Kenneth O. Johnson, 
Business Manager of Publications and Editor, Asha; Dorothy Sherman, Editor, Journal of 
Speech and Hearing Research (1959-62); and Wendell Johnson, Chairman (1959-62). 






















A Psychophysical Investigation of Vowel Formants 


GRANT FAIRBANKS 


PATTI GRUBB 


Although acoustic vowels are specified 
by combinations of formant frequen- 
cies, it is commonly understood that 
these frequencies vary considerably 
from utterance to utterance. The in- 
vestigations of such variations have 
provided useful information about 
individual differences in speech and 
about the range of vowel approxi- 
mations in the speech attempts that a 
listener or a voice-operated device must 
be prepared to accept. The experiment 
reported here, however, has proceeded 
in a different direction. It has taken for 
its main purpose the study of the form- 
ant structure of vowel samples that 
meet high standards of identifiability 
and judged representativeness under 
controlled laboratory conditions. 


Procedure 


Collection of Samples. Vowels select-
ed for study were /i ɪ ɛ æ ʌ ɑ ɔ ʊ u/.
It will be recognized that they are ex- 
emplary of the complete range of Eng- 
lish vowels, that all occur as vowels in 
General American speech, and that each 





Grant Fairbanks (Ph.D., University of 
Iowa, 1936) is Professor of Speech, Uni- 
versity of Illinois. Patti Grubb (Ph.D., Uni- 
versity of Illinois, 1956) is Research Associate, 
Laboratory of Neurological Research, College 
of Medical Evangelists, Los Angeles. The 
article is based on the Ph.D. dissertation of 
Patti Grubb; the investigation was supported 
by the Research Board of the University of 
Illinois. 




may be produced without ambiguity in 
the steady state. The speakers who fur- 
nished samples were seven men, profes- 
sors from the Department of Speech at 
the University of Illinois, ranging in age 
from 34 to 57 years with a median age 
of 43 years. All had been teachers for 
many years, were experienced speakers 
and habitual users of the General 
American dialect. About one week be- 
fore formal procedure in the laboratory 
a copy of the following statement was 
given to each speaker and discussed with 
him. It is quoted in full because it ex- 


plains the general rationale of the prob- 
lem. 


Explanation of the Problem for Speakers 


This experiment is concerned with nine 
vowels of the General American dialect 
and we would like to ask you to produce 
examples of each of these vowels. For 
certain reasons we want these examples 
produced in particular ways, and the 
entire success of the experiment depends 
upon your being able to do this. The job 
will not be easy, and it is for this reason 
that we are using as subjects only people 
such as yourself who have studied speech 
for many years and who have exceptional 
personal vocal skills. 

Essentially what we are trying to do is 
to collect samples of each vowel that are 
as nearly typical or representative of that 
vowel as possible. More specifically, we 
are interested in samples that depict the 
central tendency of each vowel. In order 
to make this as clear as possible, let us 
imagine that we have collected a large 
number of different words as spoken by 
a large number of General American 
speakers, each word including an example 









of the vowel in question, and that we have 
tape recordings of these words. Suppose 
that we now cut out a short portion of 
the vowel from each word, mix the 
portions up in random order together with 
similar samples from the other vowels, and 
play them one at a time for a group of 
observers who have been instructed to 
identify each vowel. Now let us take out 
into a separate group those examples of 
the one particular vowel that we are 
talking about that have been identified by 
100 per cent of the observers. If we play 
these and listen to them we will discover 
that they are similar to each other but 
not identical. They vary over a certain 
range. It is the center point of that range 
that we are interested in, and you can 
regard that point as a target to shoot at 
as you produce your examples. Another 
way of putting the problem is to say what 
we want you to do is to imagine the target 
on the basis of your experience in listen- 
ing to speech, and then demonstrate what 
the target is by producing a vowel of 
your own that hits the target as you 
imagine it. 

You will understand from this that we 
are trying to get samples that are some- 
thing more than merely acceptable and 
identifiable. It will also be clear that we 
are not asking you to produce samples of 
the way you personally, individually pro- 
duce any given vowel, but rather to think 
about the entire population of General 
American speakers. Further, we are not 
asking you to exemplify the way in which 
anyone thinks the vowels should be pro- 
duced in the sense of ‘standards’ or any- 
thing of the sort. Instead we want the 
central tendency as you hear it. 


Each speaker came to the laboratory 
individually and the complete procedure 
for collection of samples was accom- 
plished with him in a single session. 
The following instructions were read 
aloud formally. Since they will be relied 
upon for exposition of some of the pro- 
cedure, attention is invited to their de- 
tails. 


Instructions to Speakers 


Do you have any questions about the 
written explanation that I gave you earlier? 
Would you mind summarizing it in your 
own words so that we can make sure that 
we understand each other? 


Here is a list of the nine vowels that 
we are going to work with, and a few 
typical examples of words in which these 
vowels occur in GA. Don’t pay any 
special attention to the words. They are 
there simply to help identify the vowel 
further; it would defeat our purpose if 
you were to fix on the vowel of any 
particular word as a model. 

We are going to consider the vowels 
one at a time, recording as we go. For 
each vowel we will record and save two 
examples that satisfy you and then go on 
to the next vowel. After we have finished 
all the vowels we will listen once more to 
the examples that we have saved. 

While you are producing the vowels 
please sit here in this chair with the ear- 
phones on and your mouth about one 
foot from the microphone. Like this. The 
course of the procedure will be controlled 
by you and we will stay with each vowel 
until you are satisfied. A flash of this red 
light will signify that the apparatus is 
ready. As soon after as you like you can 
make your first attempt and I will be re- 
cording a section of it as you do so. A 
second flash of the light will indicate 
recording is over and you will stop phonat- 
ing. As soon as you stop you will hear 
the section played back to you over and 
over through the earphones. Listen to this 
as many times as you like. Try to decide 
if it is what you are trying for or not. 
Ask yourself if it is as good as you can 
do or if you can do better. If you are 
satisfied, say ‘Keep’ and go on to the 
next attempt. If not, say ‘Discard’ and 
repeat the attempt. In either case wait for 
the red light before going on. Please be 
very critical of what you produce and do 
not accept any example unless you are 
entirely satisfied that it depicts the central 
tendency of the vowel as you hear it in 
the language. 

You will notice that the sections that 
are recorded are short. Therefore, each 
vowel that you produce will need to be 
sustained for only a second or two. How- 
ever, use any duration longer than that 
which you find comfortable or otherwise 
desirable. 

As far as loudness is concerned, keep 
your various attempts approximately the 
same, but don’t make any special effort to 
make them exactly equal. 

We would like you to produce all the 
vowels at as nearly the same pitch as pos- 
sible, matching a standard pitch that I will 
give you from time to time in the ear- 





















phones. If you want this standard at any 
other time before an attempt, please ask 
for it. I will be able to hear you over the 
microphone. 


The recording referred to above was 
accomplished on a system consisting of 
an Altec 21-B microphone and a Mag- 
necorder PT-6 tape recorder operated 
at 15 ips. A duplicate system, exclusive 
of tape transport, was used for monitor- 
ing by the speaker, who wore a binaural 
(parallel) headset with Permoflux PDR- 
10 earphones over which he heard his 
speech contemporaneously with pro- 
duction and received instructions. As 
will appear, he also heard and evaluated 
his recorded product over the same sys- 
tem. The level of reproduction was 
adjusted to his preference during pre- 
liminary trial. The pitch standard 
referred to in the instructions was sup- 
plied at 130 cps by an oscillator. It was 
introduced into the earphones at the 
start of each series of vowel attempts 
and at such other times as requested by 
the speaker, but matching was not 
rigidly enforced. The equipment was 
arranged in a two-room laboratory suit- 
able for this type of work. 

Recording was done on loops pre- 
viously prepared, one for each sample. 
Each loop was 30” long and thus cycled 
in about two seconds. The iron oxide 
coating was scraped from all but 4.5” 
of the loop, leaving a ‘live’ section pro- 
portional to 0.3 sec. The signal light 
mentioned above was controlled manu- 
ally by the experimenter to indicate to 
the speaker the general period during 
which the ‘live’ section of tape passed 
the record head. The arrangement per- 
mitted the experimenter to obtain an 
isolated 0.3-sec sample from the middle 
of a longer sustained sample. As soon 
as each sample had been recorded and 
approved by the speaker, according to 




the procedure outlined in the instruc- 
tions, the process was repeated with the 
next vowel, and so on until all 18 sam- 
ples, two for each of nine vowels, had 
been collected. The typical individual 
sample was preceded by a number of 
attempts before final acceptance by the 
speaker, and no speaker appeared to find 
the task an easy one. 
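
The loop dimensions quoted above follow directly from the tape speed. A minimal sketch in Python, assuming only the figures given in the text (15 ips, a 30-inch loop, 4.5 inches of oxide left in place); the function and variable names are illustrative and are not part of the original description of the apparatus.

```python
# Tape-loop timing, using only figures quoted in the text.
TAPE_SPEED_IPS = 15.0   # tape speed, inches per second
LOOP_LENGTH_IN = 30.0   # length of one complete loop, inches
LIVE_LENGTH_IN = 4.5    # length of loop left coated with oxide, inches


def duration_sec(length_in: float, speed_ips: float = TAPE_SPEED_IPS) -> float:
    """Time taken for a given length of tape to pass the head."""
    return length_in / speed_ips


loop_period = duration_sec(LOOP_LENGTH_IN)    # one full cycle of the loop
live_duration = duration_sec(LIVE_LENGTH_IN)  # recordable portion of each cycle

print(f"loop period:  {loop_period:.1f} sec")
print(f"live section: {live_duration:.1f} sec")
```

The two values agree with the cycle time of about two seconds and the 0.3-sec sample stated in the text.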

As the two samples of each vowel 
were secured the two recording loops 
were spliced together into a larger loop 
that displayed the pair of samples in the 
order of recording with a pause of 
about 0.8 sec between. After all sam- 
ples had been recorded the speaker 
rested briefly and then listened to these 
pairs of samples one at a time over the 
identical system used in reproducing 
the samples earlier. He was asked to 
pick that one of the two which he con- 
sidered the more successful, and then 
invited to attempt still another sample 
if he believed he could produce an even 
more successful sample. In such a case 
the third sample was spliced into the 
loop following the second sample, and 
the speaker was asked to make first 
and second choices from the three. He 
was then invited to try yet another, 
continuing as long as he wished. No 
speaker went beyond the third sample. 

The ultimate yield of the procedure 
was a set of 126 samples for analysis, 
each of seven speakers having produced 
a pair of samples of each of nine vowels;
each sample had been approved by the
speaker after an unspecified number of 
attempts, and the speaker had ranked 
the two members of each pair. 

Judgmental Procedures. In prepara- 
tion for this portion of the experiment 
the original samples were re-recorded 
on equipment equivalent to the original. 
In the process the samples were roughly 







equated in level, within a total range of 
five decibels. They were ordered at 
random and spliced into a continuous 
stimulus tape, each sample being pre- 
ceded by a spoken item number and 
followed by a 4-sec judgment interval. 
The listening situation essentially dupli- 
cated that used in the arrangement for 
collecting samples, except that four 
matched headsets, equivalent to that 
used by the speakers, were paralleled to 
accommodate four observers at a time, 
seated facing away from each other in 
the quiet room of the laboratory. Level 
of reproduction was approximately 65 
db above threshold. 

The observers were eight young 
adults, six of them men, trained in 
phonetic analysis at the graduate level, 
and experienced as experimental ob- 
servers. All spoke the General American 
dialect natively and habitually. None 
had a history of hearing loss. 

Two types of procedure were used, 
the first of which was attempted identi- 
fication. For this part a response sheet 
was used which listed phonetic symbols 
and key words for the nine vowels 
across the top and item numbers down 
the left, with rows of cells so that re- 
sponse could be made by checking in 
the appropriate column. The observers 
were familiarized with the general na- 
ture of the stimuli and with the use of 
the response sheet by means of a short 
practice tape consisting of vowel sam- 
ples similar to the stimuli. 

The second judgmental situation was 
designed to provide an estimate of the 
degree to which each sample was repre- 
sentative of the vowel intended, and 
was carried on at later, separate sessions 
with the same observers. The same 
stimulus tape and general conditions of 
administration were used. The response 


sheet for this session showed each at- 
tempted vowel by means of a phonetic 
symbol after the item number, while 
across the top were shown equidistant 
points along a nine-point graphic rating 
scale. The points were numbered 1
through 9 from left to right; above the 
respective numbers were the words In- 
ferior, Very Poor, Poor, Below Aver- 
age, Average, Above Average, Good, 
Very Good, Superior. Each item had a 
corresponding row of nine spaces and 
rating was recorded by checking. In 
addition to the response sheet each ob- 
server had a card on which the nine 
vowel symbols were identified by code 
numbers. Prior to each item the experi- 
menter spoke the code number of the 
intended vowel, and this, together with 
the phonetic symbol after the item num- 
ber, was the means of announcing the 
intended vowel without articulating it. 
The tape was stopped for judgment 
after each sample was played. The ob- 
servers were invited to use as much 
time as needed and to ‘judge each sam- 
ple in terms of the success with which 
it represents the attempted vowel as 
that vowel most frequently occurs in 
General American speech.’ Before the 
experiment a brief practice tape was 
played which consisted of about 20 
samples from the center of the stimulus 
tape. These included at least one of each 
vowel and, in the opinion of the experi- 
menters, at least one instance of both 
extremes of representativeness so that 
the span of the rating scale might thus 
be illustrated. 

Spectrographic Analysis. A Kay Elec-
tric Company Sona-Graph, locally
modified to provide a full-scale range 
of 3 500 cps, was used. The original tape 
loops were reproduced on the original 
recording equipment, and the first step 

























Table 1. Identification matrix.

Intended                Identified as
Vowel       i    ɪ    ɛ    æ    ʌ    ɑ    ɔ    ʊ    u
i         102    8                             1    1
ɪ           2   59   31         1         1   18
ɛ                1   82   15   13              1
æ                    14   72   22    3         1
ʌ                              76   16    9   11
ɑ                              30   68   12    2
ɔ                               1   27   83    1
ʊ                              11         2   97    2
u           4                             1    4  103
Total     108   68  127   87  154  114  108  136  106








with each sample was to make a con- 
ventional spectrogram with the 300-cps 
bandwidth and flat frequency response. 
This was done to verify the essential 
constancy of the sample and facilitate 
selection of the portion to be analyzed 
by means of the sectioner. The section 
was made midway in the sample; the 
45-cps bandwidth was used together 
with a frequency response of positive 
slope. The procedure for sectioning 
was to make the first section at the 
maximum level which would avoid 
amplitude clipping of the strongest 
formant, usually the first, and then to 
resection as many times as necessary at 


the same point, progressively increasing 
the amplitude of the display until the 
weakest formant, usually the third, had 
been resolved. Decisions in such respects 
were guided by the expectation of three 
formants at certain general locations, 
according to previous results. In over 
half the cases the first section proved to 
be adequate; two sufficed in most of 
the others, but in some samples it was 
necessary to make three. This practice 
was conservative and multiple sections 
were made in all cases that were not 
completely unequivocal. 

The frequency scale was calibrated 
individually for each vowel section by 


Table 2. Cumulative distributions of identification scores.

Intended       Identification Score (Number Correct)
Vowel       0    1    2    3    4    5    6    7    8
i                        14   13   13   13   12    9
ɪ          14   12   11   11   11    8    4    2
ɛ               14   13   12   12   11    9    6    5
æ          14   12   12   11   10    9    8    6    4
ʌ          14   13   13   12   12   10    8    6    2
ɑ          14   11   11   11   10    7    7    7    4
ɔ                    14   12   11   11   10    7    4
ʊ                              14   13   11   10    7
u                    14   13   13   13   13   13   10
Total     126  118  116  110  106   95   83   69   45













sectioning the output of a generator (3) 
which provided a complex test tone 
having the harmonics of 200 cps. This 
frequency was derived from an oscilla- 
tor and calibrated by Lissajous figure 
against a highly stable 1 000-cps stand- 
ard. Although the drift of the frequency 
scale of the spectrograph became meas- 
urable only over periods of several 
hours, it is believed that the labor in- 
volved in this method of continuous re- 
calibration was repaid by increased pre- 
cision. 

Before any frequency measurements 
of formants were made a jury of three 
persons, all sophisticated in spectro- 
graphic analysis, agreed upon the gen- 
eral locations of major energy regions 
and identified the three formants to be 
measured. In a few cases where unanim- 
ity failed the necessary decisions were 
reached in consultation between the jury 
and a fourth expert.¹ Consideration was
given to some method of amplitude 
weighting in the frequency measure- 
ment of formants, such as that used by 
Potter and Steinberg (5). However, it 
was not proposed to make any infer- 
ences with respect to pitch, the theory 
of weights is not compelling in any case, 
and the method is extremely laborious. 
It was decided that it would be prefer- 
able to make direct measurements at 
points of maximum amplitude as re- 
corded on the section, with due regard 
to the shaping characteristics of the
system. In most instances one compo-
nent was obviously most prominent, but
two adjacent components were equal
in amplitude in some formants, and here
the arithmetic mean was used.²


¹These procedures were independent of the
auditory data, of course, but when the
two types of data were brought together it
was interesting to note a strong impression
that auditory identifiability of a sample by
listeners is associated with visual identifiability
of the formants in a spectrum. This was
especially true of the preferred samples (see
below), in which the three formants were
almost invariably prominent and distinct.
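
The rule stated above for locating a formant on a section can be sketched as follows. This is an illustration in Python and not the procedure as it was carried out by eye on the sections; the function name, the data layout, and the component amplitudes are hypothetical, and exact equality of amplitudes stands in for the visual judgment that two components were equal.

```python
def formant_peak_cps(components):
    """Return the frequency of maximum amplitude within one formant region.

    `components` is a list of (frequency_cps, amplitude) pairs, ordered by
    frequency. If exactly two adjacent components share the maximum
    amplitude, the arithmetic mean of their frequencies is returned.
    In any other tie the lowest tied component is returned (an assumption;
    the text does not describe such cases).
    """
    peak = max(amp for _, amp in components)
    top = [i for i, (_, amp) in enumerate(components) if amp == peak]
    if len(top) == 2 and top[1] - top[0] == 1:
        return (components[top[0]][0] + components[top[1]][0]) / 2
    return components[top[0]][0]


# Hypothetical harmonics of a 130-cps fundamental with made-up amplitudes.
single_peak = [(260, 20), (390, 31), (520, 25)]
tied_peak = [(390, 30), (520, 30), (650, 22)]

print(formant_peak_cps(single_peak))  # one component clearly most prominent
print(formant_peak_cps(tied_peak))    # two adjacent components equal
```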


Results 

Identifications and Ratings. Table 1 
summarizes the results of the identifica- 
tion procedure. Each row shows the dis- 
tribution of 112 judgments, eight ob- 
servers times 14 samples. The entries 
along the central diagonal may be in- 
terpreted as signifying ‘correct’ identi- 
fications, in the sense that they indicate 
agreements between what the listener 
heard and what the speaker intended. It 
is plain that the samples were charac- 
terized by generally high identifiability. 
The over-all figure is 74%; individual 
vowels range from 53 to 92%. On the 
assumption that the speakers made equal 
effort across the vowels, it seems likely 
that the variations along the diagonal 
signify differences in ease of vowel pro- 
duction. The most readily identified 
samples were those of /i/ and /u/, 
which might be expected from the fact 
that they represent poles of articulatory 
position and formant combination. 
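
The percentages quoted in this paragraph can be checked from the diagonal of Table 1. The sketch below assumes the diagonal counts as read from the table, the nine vowel symbols of the study, and 112 judgments per vowel (eight observers times 14 samples); the variable names are illustrative.

```python
# Diagonal ("correct") entries of Table 1, as read from the table.
CORRECT = {
    "i": 102, "ɪ": 59, "ɛ": 82, "æ": 72, "ʌ": 76,
    "ɑ": 68, "ɔ": 83, "ʊ": 97, "u": 103,
}
JUDGMENTS_PER_VOWEL = 8 * 14  # eight observers times 14 samples

# Per cent correct for each vowel and for all vowels combined.
percent = {v: 100 * n / JUDGMENTS_PER_VOWEL for v, n in CORRECT.items()}
overall = 100 * sum(CORRECT.values()) / (JUDGMENTS_PER_VOWEL * len(CORRECT))

for vowel, value in percent.items():
    print(f"/{vowel}/ {value:5.1f}%")
print(f"over-all {overall:.1f}%")
```

Rounded to whole numbers, the results agree with the over-all figure of 74% and the range of 53 to 92% given in the text.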

Table 2 is based on the identification 
scores of individual samples, that is, 
number correct, the maximum score 


²Although the study of nondistinctive
energy regions was not an objective of the 
experiment, certain observations may be of 
interest. In most of the samples a concentra- 
tion was encountered in the 3000-3 500 cps 
range, a region that a number of previous 
investigators have remarked. Additional re- 
gions at lower frequencies, apparently char- 
acteristic of individual voices, were also 
found. Such regions tended to be reasonably 
constant from sample to sample within 
speaker, but to differ in location from speaker 
to speaker. For instance, one speaker exhibited 
two regions of this kind in most of his 
samples, one at about 1250 cps and the 
other at about 1650 cps. 



















Table 3. Cumulative distributions of median ratings.

Intended       Median Rating (1-to-9 Scale)
Vowel       1    2    3    4    5    6    7    8
i                        14   13   10    8    5
ɪ               14   11    9    6    5    2
ɛ          14   13   13   12    9    5    2    1
æ          14   12   12   11    9    6    4    1
ʌ               14   13   13   11   10    6
ɑ          14   12   11   11   11    7    4    1
ɔ                    14   13    9    4    3    1
ʊ               14   13   11    9    5    4    2
u               14   13   12   11    7    3    2
Total     126  121  114  106   88   59   36   13








being eight. A cumulative distribution 
is shown for each vowel, and a given 
entry is to be interpreted as showing the 
number of samples having the indicated 
score or better. For instance, 11 of the 
14 samples of /ɛ/ had identification
scores of five or higher. The combined 
distribution is shown along the bottom. 
It will be noted that 106 of the 126 
samples were identified by four or 
more observers, 83 by six or more, and 
45 by all eight. Special interest attaches 
to the well-identified samples and in 
particular, as will be seen, to those with 
scores of six and higher. The samples 
that were less frequently identified are 
useful, however, for comparative pur- 
poses, and Table 2 shows that the range 
of identifiability was large enough to 
provide this kind of contrast.³
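
The "score or better" tabulation used in Table 2 can be sketched as below. The article does not list the individual scores; the fourteen values shown for /ɛ/ are inferred from that vowel's row of Table 2 as read here, and the function name is illustrative.

```python
def score_or_better(scores, max_score=8):
    """Number of samples with a score of at least s, for s = 0 .. max_score."""
    return [sum(1 for x in scores if x >= s) for s in range(max_score + 1)]


# Identification scores for the 14 samples of /ɛ/, inferred from the
# cumulative row of Table 2 (not reported sample by sample in the article).
scores_eh = [8] * 5 + [7] + [6] * 3 + [5] * 2 + [4] + [2] + [1]

cumulative = score_or_better(scores_eh)
print(cumulative)      # entries for scores 0 through 8
print(sum(scores_eh))  # total correct judgments for the vowel
```

The entry for a score of five is 11, the example cited in the text, and the sum of the scores equals the diagonal entry for /ɛ/ in Table 1.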


³It should not be overlooked that identifica-
tion here was of samples essentially alike in
duration, level, and fundamental frequency,
shorn of any assimilation cues. High speci- 
ficity of spectrum was required of a success- 
ful sample, since this was the listener’s sole 
basis. This is obviously a different matter 
from identification of a word, such as was a 
feature of the procedure of Peterson and 
Barney (3). Although word detection is based 
in part on the vowel’s spectrum, wide latitude 
therein may be offset by the other cues that 
are available. 


The results of the rating procedure 
were used to derive for each sample a 
median rating on the 1-to-9 scale, in- 
terpreted as indicating the degree to 
which the sample approached the cen- 
tral tendency of the intended vowel, 
that is, the extent of its representative- 
ness. Table 3 is concerned with these 
data and shows cumulative distributions. 
The general level of rating may be ob- 
served to be high, with 88 samples 
(70%) having medians at the midpoint 
of the scale or higher, yet the range is 
wide enough for discrimination. As 
would be expected the relationship be- 
tween identification scores and median 
ratings was substantial. For the 126 
samples the product-moment correla- 
tion coefficient was 0.76 between the 
two. At the upper end of the distribu- 
tion of ratings, the region of greatest 
interest for this study, it was notable 
that identification was uniformly high. 
For instance, the highest interval of 
Table 3 contains 13 samples; 12 of these 
were identified by all eight observers 
and the remaining sample by seven. 
However, at the upper end of the dis- 
tribution of identification scores the 
range of median ratings was wide. The 
45 samples identified by all observers 







(Table 2, last column) were distributed 
continuously from 8.5 down to 3.5 in 
median rating. That is, in representa- 
tiveness some of these easily identified 
samples were considered to be Very 
Good to Superior, while others were 
Below Average to Poor. This relation- 
ship might be summarized by suggesting 
that identifiability as a condition for 
representativeness appears to be neces- 
sary but not sufficient. 
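
The two statistics used in this paragraph, the median of the eight ratings given to a sample and the product-moment correlation between identification scores and median ratings, can be sketched as follows. All data in the example are hypothetical, since the individual ratings are not reported; the function name `pearson_r` is illustrative.

```python
from math import sqrt
from statistics import median


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings by eight observers for one sample; with an even
# number of observers the median can fall halfway between scale points.
ratings_one_sample = [9, 8, 9, 8, 9, 8, 9, 8]
median_rating = median(ratings_one_sample)

# Hypothetical identification scores and median ratings for six samples.
id_scores = [8, 8, 7, 6, 4, 2]
median_ratings = [8.5, 3.5, 6.0, 5.0, 3.0, 2.0]
r = pearson_r(id_scores, median_ratings)

print(median_rating)
print(round(r, 2))
```

The second hypothetical sample (identified by all eight observers but rated 3.5) illustrates the kind of case described in the text, in which identification is perfect while judged representativeness is low.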

It will be remembered that each 
speaker approved his own product, and 
that in many instances a sample received 
approval only after a string of attempts. 
All 14 samples of a given vowel, there- 
fore, may be regarded as having been 
screened by experts from a larger num- 
ber of vowel samples. Since they met 
such a criterion of acceptability they 
will be referred to as self-approved 
samples. 

Data from the identification pro- 
cedure were employed to form an arbi- 
trary category of identified samples, as 
they will be termed, consisting of those 
samples correctly identified by 75% or 


more of the observers. In Table 2 these 
are the 83 samples tabulated in the 
column under the heading 6, where it 
will be noted that the number varied 
from four to 13 among the individual 
vowels. 

Within the group of identified sam- 
ples of each vowel a third set, preferred 
samples, was defined. The intention 
here was to pick the most representative 
samples from among the most readily 
identified samples. From each set of 
identified samples of a vowel the four 
with the highest median ratings were 
selected. This number was also arbi- 
trary, constituting those above the 75th 
percentile of the distribution of self- 
approved samples. Ties in the fourth 
rank occurred with /i/ and /e/, and 
were resolved by including five samples 
in these cases only. In /ɪ/ only four
identified samples were available by the 
criterion for that category, so in this 
vowel the preferred samples included 
all identified samples. 
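
The three categories of samples can be expressed as a selection rule. The sketch below is one reading of the procedure described above: a sample is identified if at least 75% of the eight observers named it correctly, and the preferred samples are the four identified samples with the highest median ratings, with ties at the fourth rank retained. The sample labels, scores, and ratings are hypothetical, as is the function name.

```python
def select_samples(samples, n_observers=8, id_fraction=0.75, n_preferred=4):
    """Split self-approved samples into identified and preferred sets.

    `samples` is a list of (label, identification_score, median_rating).
    """
    threshold = id_fraction * n_observers  # six of eight observers
    identified = [s for s in samples if s[1] >= threshold]
    # Highest median rating first; Python's sort is stable, so samples with
    # equal ratings keep their original order.
    ranked = sorted(identified, key=lambda s: s[2], reverse=True)
    if len(ranked) <= n_preferred:
        return identified, ranked  # too few: all identified are preferred
    cutoff = ranked[n_preferred - 1][2]  # rating at the fourth rank
    preferred = [s for s in ranked if s[2] >= cutoff]  # keep ties at cutoff
    return identified, preferred


# Hypothetical samples of one vowel: (label, score out of 8, median rating).
samples = [
    ("a1", 8, 7.5), ("a2", 8, 6.0), ("a3", 7, 6.0), ("a4", 6, 5.5),
    ("a5", 6, 5.5), ("a6", 6, 4.0), ("a7", 5, 8.0), ("a8", 3, 2.0),
]
identified, preferred = select_samples(samples)
print([s[0] for s in identified])
print([s[0] for s in preferred])
```

In this made-up case the tie at the fourth rank yields five preferred samples, and the highly rated sample "a7" is excluded because it fails the identification criterion.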


Formants One and Two. Table 4 is 
devoted to mean frequencies and is 


Table 4. Mean values of F₁, F₂, and F₃ (cps) for self-approved, identified, and preferred
samples.

                               Intended Vowel
                    i     ɪ     ɛ     æ     ʌ     ɑ     ɔ     ʊ     u
Formant One
  Self-Approved   267   426   530   660   599   680   612   419   276
  Identified      264   387   504   700   588   743   592   418   272
  Preferred       263   387   493   733   588   775   600   392   279
Formant Two
  Self-Approved  2251  1914  1691  1569  1176  1096   788  1136   840
  Identified     2284  2038  1678  1606  1187  1083   690  1124   806
  Preferred      2378  2038  1660  1654  1199  1064   846  1122   825
Formant Three
  Self-Approved  2974  2501  2520  2464  2576  2614  2664  2435  2517
  Identified     2991  2591  2430  2468  2591  2692  2615  2388  2518
  Preferred      3099  2591  2444  2510  2623  2614  2636  2500  2496



































[Figure 1: plot with areas marked PREFERRED, IDENTIFIED, and SELF-APPROVED.]

Figure 1. Frequency areas of Formants One and Two for self-approved, identified, and pre-
ferred samples of vowels. Values in cps.


presented mainly for reference. As ex- 
plained above, each set of self-approved 
samples consisted of the 14 offered by 
the speakers, the sets of identified 
samples varied in number, and the sets 
of preferred samples numbered four or 
five. 

The major findings are displayed in 
Figure 1, where F₁ and F₂, the fre-
quencies of the lowest two formants,
are shown along abscissa and ordinate, 


respectively, the scale units being equal. 
This general arrangement for showing 
the combinations of F₁ and F₂ will be
familiar from previous reports, although 
it will be noted that here both co-
ordinates are logarithmic throughout.⁴





⁴Thus Figure 1 may not be exactly com-
pared to linear plots, or to those using the
mel scale of Stevens, Volkmann, and Newman 
(6) or the pitch approximation scale of 
Koenig (2). 







For each vowel the self-approved area 
was formed by plotting the 14 samples 
as individual points (not shown) and 
connecting the extreme points by 
straight lines, reflex angles being 
avoided. The general locations of the 
areas are conventional, closely resem- 
bling, for example, those shown by 
Peterson and Barney (3). The areas are 
smaller, however, than are usually ob- 
tained for a group of speakers, with 
correspondingly less overlapping, and 
presumably this is attributable to the 
use of expert subjects. 
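
The construction of an area, connecting the extreme points by straight lines while avoiding reflex angles, is read here as forming a convex hull. The sketch below uses the standard monotone-chain method on logarithmic coordinates, in keeping with the log scales of Figure 1; this is an interpretation and not the authors' stated method, and the (F₁, F₂) values are hypothetical.

```python
from math import log10


def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull by the monotone-chain method, counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def outline(samples_cps):
    """Outline of a vowel area, computed on log axes, returned in cps."""
    to_cps = {(log10(f1), log10(f2)): (f1, f2) for f1, f2 in samples_cps}
    return [to_cps[p] for p in convex_hull(list(to_cps))]


# Hypothetical (F1, F2) values in cps: four extreme points and one interior.
samples = [(200, 2000), (400, 2000), (400, 3000), (200, 3000), (300, 2500)]
print(outline(samples))
```

The interior point is dropped and only the extreme points remain as vertices of the area.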


It will be noted in Figure 1 that the 
lower slanting edge of the /ɔ/ area is
along the line where F₁ = F₂. The six
samples produced by three speakers fall 
on this line. The original spectrographic 
sections of these samples clearly showed 
a single concentration of energy in the 
lower frequencies, with maximum 
amplitude either in one prominent com- 
ponent or in two adjacent components 
of equal amplitude. These were in con- 
trast to the eight remaining samples of 
/ɔ/, each of which showed two definite
formants. The possibility was con- 
sidered that the points of sectioning 
might have been unrepresentative in 
spite of the fact that their locations had 
been determined by means of the pre- 
liminary conventional spectrograms. 
Consequently, each sample was sec- 
tioned periodically at seven points, 0.04 
sec apart. The original finding was con- 
firmed in every case. It would appear 
that in these six one-formant samples of 
/o/ F, and F, were identical, or at 
least so close that two formants were 
not differentiated by the available com- 
ponents. It is notable that all six were 
among the identified samples, which 
totalled 10 in the case of this vowel. 


Since this close proximity of F1 and F2 was found only in /ɔ/,
this vowel is apparently the closest approach to a vowel having
only one distinctive formant, that is, two resonators similarly
tuned, and may well be the limiting case.⁵


The identified areas in Figure 1 surround the points corresponding
to those samples identified by 75% or more of the observers. It
will be observed that the effects of imposing this criterion are
that the area for each vowel has been reduced in size and that the
nine areas are now mutually exclusive. This suggests strongly that
when a vowel attempt is successful in the sense of
identifiability, that is, is essentially free of ambiguity as a
sample of the vowel in question, F1 and F2 may be sufficient to
specify the vowel. Reciprocally, the finding suggests that
fulfillment of F1 and F2 requirements in combination would be
likely to foster successful identification of a vowel so
constituted. In other words, the data support the idea that the
identifiability of an uttered vowel depends in part upon the
degree to which F1 and F2 approach the standards for that vowel.
This is by no means an original view of the matter, but in support
of it the past evidence drawn from live speech has been meager.⁶
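
As a small illustration of the 75% criterion: with the eight observers used in this experiment, a sample qualifies as identified when at least six observers name the intended vowel. The sketch below assumes each sample is summarized by its count of correct identifications; the sample names and scores are hypothetical.

```python
import math


def votes_needed(n_observers, criterion=0.75):
    """Smallest number of correct identifications that meets the criterion."""
    return math.ceil(criterion * n_observers)


def identified_samples(scores, n_observers=8, criterion=0.75):
    """Keep the samples whose identification score meets the criterion."""
    need = votes_needed(n_observers, criterion)
    return {name: s for name, s in scores.items() if s >= need}


# Hypothetical identification scores (correct observers out of 8).
scores = {"sample_1": 8, "sample_2": 6, "sample_3": 5, "sample_4": 0}
print(votes_needed(8))             # 6
print(identified_samples(scores))  # {'sample_1': 8, 'sample_2': 6}
```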


⁵Crandall (1) many years ago also failed to resolve two formants
for the same vowel as produced by adult speakers.


⁶It is interesting to note that imposition of a criterion of 100%
identifiability did not result in any such large restriction of
areas in the experiment of Peterson and Barney (3). As has already
been mentioned, however, in that experiment identification was
essentially word detection, and high word detection does not
necessarily mean that the vowel will be identifiable in isolation.
















Study of Figure 1 will show that in eight of the nine vowels, all
but /u/, the identified area is located near the periphery of the
self-approved area. These locations vary from vowel to vowel in an
interesting manner. For instance, to consider only the F1
dimension, in /ɪ/ and /e/ they are found among the lower values,
in /a/ among the higher values, in /u/ among the medium values,
etc. If such relative locations within the larger areas are
studied in connection with the identification matrix of Table 1,
it will be seen that in every vowel the location tends to be away
from the identified area of that vowel for which misidentified
samples were most often mistaken. This appears to be an important
finding and it may be illustrated by reference to the general
findings in the case of two vowels. Table 1 shows that
self-approved samples of /ɔ/ were misidentified 29 times of which
27 were as /a/. Similarly, 42 of the 44 misidentifications of the
purported /a/ samples were distributed between /a/ and /ɔ/. In
Figure 1 the relationships between the self-approved and
identified areas for /ɔ/ and /a/ are obviously correlated with
these misidentifications. Thus, when the criterion of
identifiability was applied the vowel areas not only became
smaller and mutually exclusive, but also were plausibly specific.

According to Table 1 the vowel most often used in judgments was
/a/. Of the 154 usages, 76 were correct in the sense that they
were applied to self-approved samples of /a/; 63 of the remaining
usages were applied to alleged samples of /æ/, /a/ or /u/, and in
each of the three /a/ was the second most frequent judgment. If
Figure 1 is examined with this distribution in mind and studied
with regard to the differences between the original self-approved
areas and the identified areas of the four vowels, and their
relative locations, it becomes apparent that portions of the
self-approved areas of /æ/, /a/ and /u/ outside of their
respective identified areas approach the identified area of /a/,
with /a/, in fact, encroaching upon it. In short, the ‘incorrect’
judgments of such vowel samples were not incorrect in terms of F1
and F2. As a whole this evidence supports the general conclusion
that when cues for detection other than those residing in the
spectrum are held to a minimum, as they were in the present
experiment, vowel identifiability is dependent upon requirements
for the combination of F1 and F2 that are comparatively rigid.
The identified areas of Figure 1 afford information that bears on
the absolute versus relative vowel theories. In review, the
essential position of the absolute theory is that a vowel is
characterized by a unique combination of formant frequencies, in
which combination the absolute frequencies F1 and F2, not the
relation between them, are the important data; the relative theory
proposes that the vowel is specified by the ratio of frequencies,
F2/F1, not by their absolute values. Thus in the relative theory
the formant frequencies may vary within vowel, given only that
their ratio remain essentially constant. In terms of Figure 1 the
discreteness of the nine identified areas supports the absolute
theory, but does not as such confute the relative theory. However,
the figure permits a graphic test of the relative theory. Since
its coordinates are logarithmic and of equal scale, the ratio
F2/F1 is constant along any straight line which ascends to the
right at an angle of 45° with the abscissa. The relative theory is
not valid where such a line is common to more than one vowel, and
this is seen to be the case. For example, the line for the ratio
2.5, which originates in the lower left-hand corner with 500/200,
rises through the identified areas for /u/, /ʊ/ and /æ/. When the
smaller preferred areas (see below) are considered it will be
noted that /u/ and /ʊ/, /ɔ/ and /a/, /a/ and /æ/ are pairs of
vowels with similar F2/F1. However, Figure 1 shows plainly that
neither F1 nor F2 alone is a complete specification; for example,
600 cps for F1 is common to three identified areas, the vowels
being differentiated by F2. The importance of absolute location is
emphasized at several places in Figure 1, but perhaps most plainly
by the compactness of the measurements of F1 in /u/ and /i/. Each
of these vowels had 13 samples in the identified group (Table 2).
The values of F1 ranged, respectively, from 233 to 300 and from
217 to 283 cps, suggesting close adherence to a standard frequency
location. In summary, the conclusions to be reached are that the
data are positive support for an absolute theory and demonstrate
that the relative theory is not tenable as a complete
explanation.⁷
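
The graphic test described above can be sketched numerically. On equal-scale logarithmic axes a constant ratio F2/F1 appears as a line of slope 1 (a 45° line), and the relative theory is in difficulty whenever the ratio ranges of two distinct vowels overlap. The following Python sketch assumes samples are given as (F1, F2) pairs in cps; the two vowel sample sets and the function names are hypothetical, chosen only to illustrate the check.

```python
import math


def log_slope(p, q):
    """Slope of the segment p-q when both axes are logarithmic."""
    (f1a, f2a), (f1b, f2b) = p, q
    rise = math.log10(f2b) - math.log10(f2a)
    run = math.log10(f1b) - math.log10(f1a)
    return rise / run


def ratio_range(samples):
    """Smallest and largest F2/F1 ratio among the samples."""
    ratios = [f2 / f1 for f1, f2 in samples]
    return min(ratios), max(ratios)


def ranges_overlap(a, b):
    """True when two closed intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


# Two points on the ratio-2.5 line, starting from the 500/200 corner.
print(round(log_slope((200, 500), (400, 1000)), 6))  # 1.0

# Hypothetical samples for two vowels far apart in absolute frequency.
vowel_x = [(280, 700), (300, 780)]
vowel_y = [(640, 1600), (680, 1650)]
print(ranges_overlap(ratio_range(vowel_x), ratio_range(vowel_y)))  # True
```

In this illustration the two vowels occupy quite different absolute locations yet share a ratio, which is the situation the text describes as incompatible with a purely relative specification.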


⁷In evaluating this evidence it should not be overlooked that the
present subjects were exclusively men. As indicated by earlier
investigations, notably Potter and Steinberg (5) and Peterson and
Barney (3), the relative theory may apply to vowels in a different
sense, namely, that the complete systems of women and children are
displaced upward roughly along an equal ratio line in a plot of F1
and F2, presumably because of smaller physiological resonators. So
interpreted the relative theory may indeed be applicable as a
reference to the preservation of relations between formant
combinations in subgroup or even individual systems of vowels.


Attention is now directed to the preferred areas in Figure 1,
where the samples have been restricted to those which the
observers considered most representative, according to criteria
explained above. In every vowel the preferred area is seen to be
very much smaller than the self-approved area and, in most cases,
to be considerably smaller than the identified area. The most
extreme instance of the latter relationship is /u/; the main
exception is /ɪ/, where it will be recalled that the identified
and preferred samples were identical. Evidently
within-identification preference is correlated with restriction of
both F1 and F2. It was noted above that most of the identified
areas tend toward extreme locations within their respective
self-approved areas. The locations of some of the preferred areas
are even more extreme in certain vowels, notably /e/, /e/ and /a/;
in others they are not. For instance, in /u/ identification seems
to depend upon low values of both formants, but the extreme
combinations were not considered to be most representative. The
case of /ɔ/ is interesting and instructive. As shown in
Figure 1, the criterion of identified samples eliminated the four
self-approved samples with the highest values of F2. The remaining
10 identified samples included the six one-formant samples
discussed earlier, to be seen along the 45° line in Figure 1. None
of these six, however, was among the four top-rated samples that
made up the preferred group. Thus it might be said that high
identifiability of /ɔ/ seems to obtain when F2 approaches F1, that
is, such a sample is not ambiguous, but that samples in which the
proximity is too close seem to be less preferred.













The locations of the preferred areas 
in Figure 1 emphasize even more 
strongly that discreteness which was 
remarked in discussion of the identified 
areas. In general, as the criteria become 
more exacting the areas not only shrink, 
but seem to be drawing away from each 
other, so that the samples judged to be 
most representative of a vowel are close- 
ly bunched and the different vowels are 




spaced across the coordinate field. In 
order to show this clearly Figure 2 dis- 
plays only the preferred areas. In view 
of these results and the procedures from 
which they came, the mean frequencies 
of the formants for the preferred sam- 
ples as shown in Table 4 take on con- 
siderable interest. It is suggested that 
this set of means together with the areas 
in Figure 2 provide the closest approach 





Figure 2. Frequency areas of Formants One and Two for preferred samples of vowels.
From Figure 1. Values in cps.













Figure 3. Personal vowel systems of two individual speakers shown in relation to frequency
areas of Formants One and Two for preferred samples of vowels. Areas from Figure 2.
Speaker A, highest over-all ratings; Speaker B, lowest over-all ratings.


so far to a standard model of the 
General American vowel system. 
Figure 3 was prepared in order to 
illustrate personal vowel systems, or 
individualistic configurations of formant 
combinations, in their relationships to 
the preferred areas. The samples of 
Speaker A were highest among the 
seven speakers in both identification 


scores and ratings, while those of Speak- 
er B were lowest. Only one sample per 
vowel for each speaker has been plotted 
in Figure 3, this sample being that one 
which the speaker himself ranked as 
more representative. In contrast to the 
foregoing discussion, which proceeded 
from judgment groups to formant 
measurements, attention now shifts to 













the judgments that result from particu- 
lar combinations of formant frequencies. 

As shown in Figure 3 the samples of 
Speaker A are not especially instructive 
except to illustrate close agreement be- 
tween speaker and observers; each was 
among the preferred samples. But per- 
haps this case is useful to exemplify 
the fact that the central tendency of 
the samples is not simply a statistical 
fiction. 

The samples of Speaker B as plotted in Figure 3 are informative
and repay point-by-point study. Outstanding general
characteristics are, first, the extremely small coordinate area
used for the entire vowel system other than /ɔ/, about one-fourth
of the figure’s total area; second, the extremely narrow range
(533-600 cps) within which the F1 values of six samples are to be
found; third, the concentration of four samples near /a/. As a
matter of fact, the key vowel of this speaker’s vowel system, as
Figure 3 shows vividly, is /a/. His
intended sample of /a/ was so identi- 
fied by seven of the eight observers, 
had a median rating of 7.5 on the nine- 
point scale of representativeness, and 
was one of the four preferred samples 
of /a/. About it in Figure 3 are clustered three other points,
representing attempts at /æ/, /a/, and /u/. But these three
samples were identified as /s/ by eight, seven, and three
observers, respectively, and rated 1.0, 1.5, and 2.5 as examples
of the vowels intended by the speaker. The plotting of the sample
offered for /e/, unanimously labelled /s/, is a conspicuous
instance of the untenability of the relative vowel theory. The
ratio F2/F1 for this sample is 2.24, very close to the average of
the preferred samples of /æ/, which is 2.26. Speaker B’s sample of
/ɔ/ was one of the one-formant samples, the lowest in
frequency of the six; its identification 
score was seven, but its median rating 
was only 5.0. That is, it was easily 
identified, but not considered highly 
representative. As Figure 3 shows, the 
point for this sample is distant from the 
preferred area, but no ambiguity re- 
sulted because the displacement of the 
formants was not directed toward any 
other vowel area. The sample of /u/ 
was identified by all observers and was 
one of the preferred group. As may be 
seen in Figure 3, the remaining three 
samples, offered for /i/, /ɪ/ and /e/,
were consistently displaced upward with 
respect to F, and confined within an 
extremely narrow range of F,. Their 
respective identification scores were 3, 
0, and 4, and their median ratings were 
4.0, 2.5, and 4.5. Distributions of identi- 
fications were unremarkable except for 
the /ɪ/ sample, which seven observers identified as /u/. Since the
point is very close to the /e/ area, as may be seen, although the
sample resembles /u/ in F1, this constitutes a discrepancy between
formant combination and identification, notably rare, which
suggests possible operation of identifying cues other than those
afforded by F1 and F2, perhaps in the amplitude domain.


Formant Three. The results of the frequency measurements of the
third formant are summarized by the set of means in Table 4. The
individual values of F3 were studied in relation to the judgmental
data in the same manner as for F1 and F2. Plots of F3 with F1 and
with F2 showed the same progressive narrowing of areas from
self-approved to identified to preferred samples as has been
reported above for the plots of the two lower formants. However,










Figure 4. Frequency areas of Formants Two and Three for preferred samples of vowels.
Values in cps.


overlapping of ranges was very much more extensive in the F3
dimension. In general location, by either of the coordinate
combinations, the areas resembled those plotted by Potter and
Peterson (4) for eight adults.


The nature of the results with the third formant may be
illustrated by Figure 4, which shows the preferred areas in a
coordinate plot of F2 and F3. Although this most extreme of the
restrictions has been accompanied by considerable separation of
areas, examination of the areas will show that most of this is
attributable to F2. In fact, a vertical line at about 2550 cps
would pass through all areas except that of /i/. The areas overlap
at three points, in contrast to Figure 1, where even the less
powerful restriction to identifiable samples resulted in
separation of all areas. The plot of F1 and F3 was even less
effective in resolving the vowels. The shrinking of range with
application of the identified and preferred criteria suggests that
F3 gives some information, for example, may denote a subclass of
vowels, but in combination either with F1 or with F2 alone the
vowel is not completely specified. The frequency of the third
formant may make its most important contribution by participating
in the distinction between /i/ and the non-/i/ vowels.


Summary 


Nine General American vowels were 
sustained by seven skilled speakers at 
approximately the same fundamental 
frequency. Steady-state samples, two 
for each vowel, each 0.3 sec in duration, 
were recorded, individually self-ap- 
proved by each speaker as representa- 
tive of the intended vowel, randomized, 
approximately equated in level, and 
presented twice to a group of eight 
trained observers. At the first presentation the observers
attempted vowel identification. At the second presentation they
rated the samples on a scale of representativeness, knowing the
vowel which each sample was intended to represent. The frequencies
of the formants were measured and studied in relation to the
judgments, with the following major findings.

a. Coordinate plots of the first two 
formants were conventional in their 
general locations, but the influence of 
the selective sampling could be observed 
in the areas for the nine vowels. These 
were smaller and overlapped less ex- 
tensively than the areas for unselected 
samples reported in previous studies. 

When the samples were restricted to those which had been correctly
identified by 75% or more of the observers, the areas were
considerably smaller and mutually exclusive, so that specification
of F1 and F2 differentiated any area from all others. In most
instances the ratio F2/F1 for a given vowel had a range in common
with that for at least one other vowel, which should not be the
case if the relative vowel theory is a complete explanation.

When the identified samples were further restricted to those
judged to be most representative of the respective vowels, most of
the areas became very small. However, even with these small areas
adequate differentiation between them was not given by either F1,
F2 or F2/F1 alone.

b. Although associations between 
self-approval, identifiability, and repre- 
sentativeness were close, the relation- 
ships were not perfect. Self-approval 
by an expert did not insure identifiabili- 
ty. Identifiability was not invariably 
accompanied by high judged repre- 
sentativeness; it seemed to be a neces- 
sary but not sufficient condition 
therefor. When a given vowel sample was misidentified or judged to
be unrepresentative, a plausibly related deviation of F1 or F2 was
in most cases present. Most of the atypical values of F1 or F2
were reflected in the judgments.


c. Study of the third formant confirmed the results of past
investigations to the effect that F3 is a much less powerful
determinant of acoustic vowelness than either of the lower two
formants. As the judgmental restrictions were applied, the ranges
of F3 tended to decrease progressively, suggesting that some
information is contributed. When the most rigorous restriction was
exerted, F3 was higher for the samples of /i/ than for almost all
of the preferred samples of other vowels, such vowels being left
with a common range of F3.

d. In some of the samples of /ɔ/, only one formant could be
distinguished in the lower range. Although such one-formant
samples were readily identified, they were not among the group
judged to be more representative.


References 


1. Crandall, I. B., The sounds of speech.
Bell Syst. tech. J., 4, 1925, 586-626.

2. Koenig, W., A new frequency scale for
acoustic measurements. Bell Lab. Rec., 27,
1949, 299-301.

3. Peterson, G. E., and Barney, H. L., Control
methods used in a study of the vowels.
J. acoust. Soc. Amer., 24, 1952, 175-184.

4. Potter, R. K., and Peterson, G. E., The
representation of vowels and their movements.
J. acoust. Soc. Amer., 20, 1948, 528-535.

5. Potter, R. K., and Steinberg, J. C., Toward
the specification of speech. J. acoust. Soc.
Amer., 22, 1950, 807-820.

6. Stevens, S. S., Volkmann, J., and Newman,
E. B., A scale for the measurement of the
psychological magnitude pitch. J. acoust. Soc.
Amer., 8, 1937, 185-190.





Dimensions of Language Performance in Aphasia 


LYLE V. JONES 


JOSEPH M. WEPMAN 


Attempts to classify language deficit consequent to cortical
insult are prominent in the literature concerning aphasia (for
example, 5, 9). Critical of such classificatory systems is the
recent paper by Schuell and Jenkins (7) who present evidence which
they interpret to be ‘more compatible with the theory of a single
dimension of language deficit than with the multiple dimensions or
topologies suggested in the past.’ Currently, then, it is not
clear whether qualitative distinctions can usefully be made among
aphasia disorders. If they can, there appears no general agreement
concerning the appropriate form of such distinctions.





Lyle V. Jones (Ph.D., Stanford University, 
1950) is Professor of Psychology and Direc- 
tor, Psychometric Laboratory, University of 
North Carolina. Joseph M. Wepman (Ph.D., 
University of Chicago, 1948) is Associate 
Professor of Psychology and Surgery and 
Director, Speech and Language Clinic, Uni- 
versity of Chicago. The paper is a product 
of the collaboration between the Psycho- 
metric Laboratory of the University of 
North Carolina and the Speech and Language 
Clinic of the University of Chicago. Different 
phases of the research reported were supported by grants from the
Department of Health, Education, and Welfare through (a) the
National Institutes of Health, Neurological Diseases and Blindness
Council (Grant
B-710), (b) the National Institutes of Mental 
Health (Grants M-1849 and M-1876), and 
(c) the Office of Vocational Rehabilitation 
(Grant 168). 


Volume 4, No. 3 


Before embarking upon an exploration of the nature of language
performance in brain-damaged patients, it is imperative to define
the language act, as well as to specify carefully a population of
subjects to which conclusions pertain. The primary source of
differences in viewing performance in aphasia clearly appears to
be one of diversity of definition, either of the domain of
language performance or of the criteria of aphasia.


Every complete language act consists 
of a stimulating situation, a system of 
central processes, and a response. The 
stimulating situation, or input, may be 
actor initiated or initiated by means 
external to the actor. The sensory mode 
of stimulation may be dominantly 
visual, aural, kinesthetic, somesthetic, 
etc. The mode of response may be 
graphic, oral, or gestural. The content 
of response is attributable both to 
characteristics of the stimulus and to 
the interpreting and mediating central 
processes. 


Systematic investigation of the na- 
ture of residual language skills follow- 
ing cortical damage would appear to 
provide unique advantages for under- 
standing normal as well as disturbed 
language processes. The extent to which 




Table 1. Categories tested and examples of each category in both visual and auditory modalities.

Visual                                        Auditory
Category              Example                 Category              Example

Pictures              baby                    Consonant Sounds      p
Letters               M                       Letter Names          dee
Geometric Forms       ○                       Geometric Forms       circle
Number, Arabic        4                       Numbers               four
Number, Printed       Four
Arithmetic Signs      +
Words                 Bread                   Words                 man
Special Words         You                     Special Words         at
Sentences                                     Sentences
  a. 3-word           Women prepare food.       a. 3-word           He has that.
  b. 5-word           He has it for us.         b. 5-word           She takes her dog out.
Tell-a-Story Picture








clusters of language skills are con- 
comitantly affected by brain damage 
should cast light on the organization 
of these skills in successful performance 
of language acts. 

Factor analysis is a statistical tool appropriate for defining
such clusters of skills. Just as factor analysis has served to
discover separate verbal abilities for normal subjects (for
example, 2), so it may be possible to identify ‘islands’ of
language performance which sometimes remain intact in brain
damaged aphasia patients. For this purpose, specially devised
tests must be developed, assessing a variety of language acts. To
be differentially sensitive to possible differences among aphasia
patients the level of difficulty of tests must be established
appropriately. Test materials should be of sufficient ease so that
among brain intact adults performance would be near perfect. Among
brain damaged adults, in order optimally to distinguish among
different levels of performance, the proportion of subjects
passing a typical item should be nearer .5 than either 0 or 1.
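
One way to see why a passing proportion near .5 discriminates best: for an item scored simply pass or fail, the variance of the item score across subjects is p(1 - p), which is largest at p = .5 and falls to zero at p = 0 or p = 1. The short Python sketch below illustrates this under that pass/fail assumption; it is not drawn from the authors' analysis.

```python
def item_variance(p):
    """Variance of a pass/fail item when a proportion p of subjects pass."""
    return p * (1.0 - p)


# Variance across a grid of passing proportions.
for p in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    print(p, round(item_variance(p), 2))

# The proportion on the grid that spreads subjects out the most.
grid = [i / 10 for i in range(11)]
print(max(grid, key=item_variance))  # 0.5
```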




Test Materials 


A battery of test materials was constructed to meet these
objectives. A film strip is used for presenting all visual
materials. Auditory stimulus items are presented orally by the
examiner. Specification of types of items presented appears in
Table 1. Kinds of responses utilized for the various stimuli are
listed below:


Visual Stimuli

a. Reads or names
b. Writes word or name
c. Matches word to picture or picture to word
d. Matches word to word or picture to picture


Auditory Stimuli

a. Repeats after examiner
b. Writes word spoken by examiner
c. Matches spoken word to picture
d. Matches spoken word to printed word


A shortened version of the test battery, the Language Modalities
Test for Aphasia (8), has been prepared for general use in the
assessment of aphasic disorders.


Subjects 


Results reported here are based upon
the test performance of 168 selected
subjects from 19 different hospitals
and clinics throughout the northeastern 
United States. Subjects were selected 
only if (a) the medical record displayed 
attested medical diagnosis of brain 
damage; (b) the hospital or clinic rec- 
ord noted language disability con- 
sequent to the brain damage; (c) 
subjects retained sufficient ability to 
follow test instructions; (d) subjects 
displayed no more than mild dysarthria. 
Evidence regarding (c) and (d) was 
obtained from an interview with the 
subject and administration of a screen- 
ing test. 


The patients studied included 147 
male and 21 female; 124 were in- 
patients, while the remaining 44 were 
out-patients. The cause of brain damage 
had been diagnosed as cerebrovascular 
accident for 130 cases, external trauma 
for 29 cases, and as tumor extirpation 
for six cases. No cause was listed for 
three patients. Right hemiparesis was reported for 130 subjects,
left hemiparesis for five, bilateral paralysis for three, no
paralysis for 25, while the medical records of the remaining five
gave no information concerning paralysis. In terms of premorbid
educational
level, 15 were college graduates, 46 
completed high school but not college, 
51 completed grade school but not 
high school, and 23 failed to complete 
grade school. Educational history of 33 
subjects is unknown. Subjects ranged 
in age from 14 to 76, with mean age of 
51; standard deviation of 15. 


Scaling of Responses 


Responses from aphasia patients, when in error, nevertheless
display a range of distinct qualitative characteristics. A
response may be judged erroneous as a result of faulty
articulation, of the appearance of an inappropriate word or
phrase, or of simple refusal of the patient to offer a response.
To accommodate such qualitative differences, the following scoring
categories were utilized for each oral or graphic response to test
items:²


Code
Number   Values

1        Correct response
2        Self-corrected response
3        Single articulatory or spelling error
4        Two or more articulatory or spelling errors
         (response recognizable as correct word or number)
5        In-class word substitution
6        Out-of-class word substitution
7        Jargon or unintelligible response
8        Automatic response
9        No response


For the purpose of precisely specifying 
the characteristics of language loss in 
aphasia patients, maintenance of such 
distinctions regarding form of errors 
is thought to be equally important 
with recognition of the mode of stimuli 
and form of response for which errors 
occur. However, to factor analyze a 
set of variates representing test per- 
formance, quantitative rather than qual- 
itative scores are required. 


To derive quantitative indices of 
performance, the categories of response 


²In the revised Language Modalities Test for Aphasia, scoring
categories have been redefined and collapsed, so that six rather
than nine are used.
















Table 2. Scale values of the nine scoring categories (C1, C2, ..., C9) determined from each of
10 classes of test items involving pictures (P), figures (F), numbers (N), letters (L), signs
(Sn), words and sentences (W-S), and sounds (Sd); number of items in each of the 10
classes; and correlation ratio for each of the 10 classes.

                                              Visual Stimuli                       Aural Stimuli
                    Oral Response          Graphic Response*     Oral Response  Graphic Response
C             P    FNLSn      W-S        P    FNLSn      W-S    SdFNL      W-S      FNL      W-S

Number of Items
              8       36       96        8       20       16       24       72       82       36

Scale Values
C1         .567     .608     .586     .659     .430     .476     .496     .568     .593     .664
C2         .132     .136     .122     .090     .023     .050     .027     .537     .106     .048
C3         .040     .049     .134     .232    -.099    -.057     .066     .020    -.001     .145
C4         .012    -.027     .025     .112    -.044    -.152    -.023    -.061    -.010     .060
C5         .089     .038     .011     .081    -.047     .004    -.085     .007    -.018     .018
C6        -.175    -.101    -.084    -.100    -.026    -.041    -.133    -.118    -.057    -.026
C7        -.459    -.401    -.266    -.294    -.496    -.486    -.436    -.489    -.294    -.034
C8        -.363    -.328    -.297                                -.234    -.286    -.057    -.034
C9        -.528    -.574    -.676    -.623    -.744    -.712    -.692    -.579    -.738    -.702

Correlation Ratio
           .721     .637     .768     .661     .?95        ?     .618     .611     .483     .659

*No graphic responses to visual stimuli were scored in category 8.


were subjected to analysis by the method of optimal scaling (1,
3). The method depends upon selecting scale values for the
response categories which maximize the correlation ratio, that is,
the ratio of the sum of squares between patients to the total sum
of squares, computed from these scale values. The resulting scale,
then, has the effect of discriminating optimally between subjects,
in the sense that between-subject variance is maximum relative to
within-subject variance.
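
The criterion being maximized can be made concrete with a short Python sketch. It computes the ratio of the between-subject sum of squares to the total sum of squares for a given assignment of scale values to the nine response categories. The sketch only evaluates the criterion for candidate scales and does not carry out the maximization itself; the subjects, their category codes, and both candidate scales are hypothetical, and whether the tabled correlation ratios are this ratio or its square root is not stated in the text.

```python
def correlation_ratio(responses, scale):
    """Between-subject sum of squares divided by total sum of squares.

    responses: dict mapping subject -> list of category codes (one per item)
    scale:     dict mapping category code -> numeric scale value
    """
    scored = {subj: [scale[c] for c in cats] for subj, cats in responses.items()}
    all_values = [v for vals in scored.values() for v in vals]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    if ss_total == 0:
        return 0.0  # every response received the same scale value
    ss_between = sum(
        len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
        for vals in scored.values()
    )
    return ss_between / ss_total


# Hypothetical category codes (1-9) for three subjects on four items.
responses = {
    "subject_1": [1, 1, 2, 1],
    "subject_2": [1, 5, 9, 6],
    "subject_3": [9, 9, 7, 9],
}

# Two hypothetical candidate scales: equal steps, and one stressing the ends.
equal_steps = {c: float(10 - c) for c in range(1, 10)}
ends_weighted = {1: 0.6, 2: 0.1, 3: 0.05, 4: 0.0, 5: 0.0,
                 6: -0.1, 7: -0.4, 8: -0.3, 9: -0.6}

print(round(correlation_ratio(responses, equal_steps), 3))
print(round(correlation_ratio(responses, ends_weighted), 3))
```

Comparing the two printed values shows how the choice of scale values changes the proportion of variance lying between subjects, which is the quantity the scaling method seeks to make as large as possible.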


Scaling was accomplished separately 
for various subgroups of test items (see 
Table 2). For the purpose of deriving 
scale values for response categories, the 
within-subject variance was defined 
over all items within each item class. 


Since the test was administered in two 
forms, presented on successive days, 
the within-subject variance includes 
variance due to the sampling of items, 
variance due to instability of perform- 
ance during a test session, and day-to- 
day variability of a subject. The 
within-subject variance thus is a good 
measure of the variance which detracts 
from test reliability. Indeed, the scaling 
method may be regarded as one which 
maximizes reliability of the subtests 
(item classes). 


From results of the scaling analysis, 
Table 2, it is seen that the correlation 
ratios are sufficiently high to make 
possible clear quantitative distinctions 
among subjects’ performances. In most 
cases, the order of size of the scale 





224 Journal of Speech and Hearing Research 


values corresponds closely to that con- 
jectured when category designations 
(1, 2,..., 9) were assigned. The one 
consistent discrepancy is the inversion 
of scale values for categories 7 (jargon) 
and 8 (automatic phrases). For auditory 
stimuli especially, 7 is closer to 9 (no 
response) than is 8. This would indicate 
that the tendency toward jargon re- 
sponse appears frequently in conjunc- 
tion with the tendency toward failure 
to respond. Jargon and response failure 
are more often associated within the 
same subjects than automatic phrases 
and failure of response. 


Other departures from the conjec- 
tured ordering are confined to cate- 
gories 2, 3, 4, 5, and 6, and are not 
consistent from scale to scale. The 
scale values for these categories are
close together numerically, when com- 
pared with the full range of the scale, 
and it is apparent that the large con- 
tribution to discrimination is being 
made by categories 1 and 9, ‘correct’ 
and ‘no response.’ Taking into account 
the smaller contribution of the middle 
categories, it may be concluded that 
the scales are sufficiently similar to 
justify adopting a single coarse scale 
to quantify all responses for purposes 
of item analysis: 


Category       1    2    3    4    5    6    7    8    9
Scale Value    6    1    1    0    0   -1   -4   -3   -7


It can be shown that the correlation 
ratios based upon this approximate 
scale are only slightly smaller than the 
maximal values displayed in Table 2. 
The economy of adopting a single 
one-digit scale for all item classes would 
appear to override the slight loss in 
discrimination among subjects. 


Item Difficulty 


Several measures of difficulty of item 
classes have been determined, both in 
terms of proportion of correct re- 
sponses and mean scale values of scor-
ing categories. The measures agree in
distinguishing several item classes as 
markedly more difficult than others. For 
the subjects studied, the task of pro- 
viding written response to aurally pre- 
sented sentences is most difficult, with 
mean proportion of words correctly 
written, P = .25. Other difficult tasks
are graphic response to pictures (de- 
manding the written name of whatever 
is pictured), written response to aural 
words, oral response to visually present- 
ed geometric forms and to arithmetic 
signs, with mean proportion correct, P, 
ranging from .26 to .29. It is to be 
noted that all of these item classes 
demand a translation from a given 
stimulus to a response mode of a 
distinct kind. The easiest items, on
the other hand, involve copying of
short visual stimuli (geometric forms,
letters, arithmetic signs, numerals), P
= .73, or repeating of short auditory
stimuli (numbers, letters, prepositions,
and pronouns), P = .66.


Factor Analysis 


Two sets of factor analyses have been 
performed, one on 35 test variables, the 
other on 37 test variables. The lists of 
test variables overlap considerably; the 
second list differs from the first most 
importantly in its inclusion of items 
involving matching responses (typically 
matching a visual or aural language 
stimulus to one of four alternative 
pictures). Because of this addition, the 
37-variable problem will be discussed 













here. In no case is interpretation of 
results from the 35-variable study in- 
consistent with that from the 37-vari- 
able study. 


Four distinct forms of analysis were 
completed, based upon performance of 
the 168 subjects on the 37 variables 
(defined in Table 4) included in the 
study. Two were graphical, one or- 
thogonal and one oblique graphical 
solution having been obtained. Two 
were analytic, a varimax (orthogonal) 
solution (6) and an oblimin (oblique) 
solution.² Since the initial analyses were
completed, still further solutions have 
been obtained, an oblimax solution and 
a hierarchical solution. 


For each of the test variables, a score 
was defined for each subject by sum- 
ming the scale values of his responses 
for all items within the class repre- 
sented by that variable. The distribu- 
tion of such scores then was normalized 
separately for each of the 37 variables. 
Product-moment correlation coefficients 
were computed among the 37 variables, 
with N = 168, utilizing the IBM 650. 
A principal axes factor solution then 
was obtained on the ORACLE, a digital 
computer located at Oak Ridge, Ten- 
nessee. The principal axes solution was 
based upon communality estimates 


$h_i^2 = 1 - 1/r^{ii}$,


where $r^{ii}$ is the diagonal element for
variable i in the inverse of the 37 x 37 
complete matrix of correlations among 
the 37 variables (see 4, p. 89). 
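The communality estimate and the principal axes step can be sketched as follows. This is a minimal illustration under stated assumptions, not a reconstruction of the ORACLE program: the correlation matrix is assumed to be nonsingular, the function names are hypothetical, and the 3 x 3 matrix is toy data chosen so that the result can be checked by hand.

```python
import numpy as np

def smc_communalities(R):
    """h_i^2 = 1 - 1/r^{ii}, with r^{ii} the i-th diagonal element of R^{-1}."""
    R = np.asarray(R, dtype=float)
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))

def principal_axes(R, n_factors):
    """Loadings on the leading principal axes of the reduced correlation matrix."""
    R_reduced = np.array(R, dtype=float)
    # Replace the unit diagonal by the communality estimates.
    np.fill_diagonal(R_reduced, smc_communalities(R))
    roots, vectors = np.linalg.eigh(R_reduced)
    order = np.argsort(roots)[::-1]              # largest characteristic root first
    roots, vectors = roots[order], vectors[:, order]
    kept = roots[:n_factors]
    # Scale each retained vector by the square root of its (non-negative) root.
    loadings = vectors[:, :n_factors] * np.sqrt(np.clip(kept, 0.0, None))
    return loadings, roots

# Toy example: three variables, all intercorrelations equal to .50.
R_toy = np.array([[1.0, 0.5, 0.5],
                  [0.5, 1.0, 0.5],
                  [0.5, 0.5, 1.0]])
h2 = smc_communalities(R_toy)
loadings, roots = principal_axes(R_toy, n_factors=1)
```

For the toy matrix each communality estimate is 1/3, the largest root of the reduced matrix is 4/3, and the two remaining roots are negative, which shows how a reduced correlation matrix can have fewer positive characteristic roots than variables.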


Several criteria agree in suggesting 
that six principal axes are sufficient to 


²J. B. Carroll, Solution of the Oblimin
Criterion for Oblique Rotation in Factor 
Analysis, unpublished manuscript, 1958. 


Table 3. Principal axes factor loadings.* Variable identifications (VI) are given by number
in the left hand column.

VI                 Factor
          1     2     3     4     5     6
 1      272  -462   280  -219  -045  -064
 2      018  -366  -097  -060   147  -492
 3      913   213  -049   079  -195  -034
 4      879   287  -181   101  -115  -100
 5      904   256  -167   090  -136   004
 6      885   297  -160   090  -102  -044
 7      902   236  -161   077  -086  -058
 8      882   278  -204   095  -071  -046
 9      893   261  -111   071  -110   068
10      879   231  -141   093  -188  -031
11      898   182  -158   096  -056  -052
12      895   236  -151   048   014  -007
13      891   225  -169   094  -057  -042
14      617  -517   181   409  -025  -030
15      562  -604   179   397   004   008
16      548  -541   273   374   064  -016
17      520  -597   210   324  -018  -044
18      623  -490  -070  -187  -262   147
19      625  -453  -154  -163  -149   010
20      623  -396  -108  -195   070  -262
21      720  -349  -171  -220  -084  -037
22      660  -362  -194  -297  -061  -072
23      819   205   409  -097   086  -051
24      774   285   426  -064   061  -012
25      739   164   467  -139   029   023
26      714   291   445  -066   027  -037
27      749   323   429  -131  -062  -058
28      790   392   167  -040   107   027
29      799   389   168  -106   118   006
30      827  -212  -216   028   324   093
31      818  -156  -173  -044   316   177
32      835  -026  -158   011   428   084
33      815  -056  -210   018   378   074
34      623  -291  -091  -283   068  -008
35      648  -400   010  -062  -191   196
36      675  -390   164  -279  -012   070
37      508  -383   066  -078  -189   143

Σa²/37 .557  .119  .050  .032  .025  .014

*Decimal points are omitted for all entries
in the body of the table.


account for nonrandom common var- 
iance for this problem. It is reassuring 
that the varimax solution was relatively 







Table 4. Varimax factor matrix,* giving stimulus, response, variable identification by number,
item, number of items, factor, and h², in that order, left to right.
[Only the columns for Factors B, E, and F and for h² are legible in this copy; entries marked
[?] are illegible. The Age row carries one additional entry, -25, belonging to one of the
missing factor columns.]

S  R  VI  Item                      Items    B    E    F   h²
       1  Age                               02  -49  -14   65
       2  Education                        -11   16   59   64
V  O   3  Nouns                        8    35   28  -04  [?]
V  O   4  Verbs                        8    26   19   04  [?]
V  O   5  Adjectives                   8    27   25  -05   97
V  O   6  Abstr nouns                  8    29   20  -01   96
V  O   7  Abstr verbs                  8    28   24   02   95
V  O   8  Abstr adjectives             8    25   20   01  [?]
V  O   9  Pronouns, prepositions       8    32   25  -11   95
V  O  10  Color names                  4    27   26  -03   94
V  O  11  Names of sounds              4    26   25   03   94
V  O  12  Sentences†                  23    31   23   00  [?]
V  O  13  Sentences**                  9    27   23   01   94
V  G  14  Nouns                        4    12   33   07   92
V  G  15  Verbs                        4    07   36   06   93
V  G  16  Adjectives                   4    18   30   08   90
V  G  17  Pronouns, prepositions       4    10   38   11   88
V  M  18  Pictures                     8    05   78  -06   87
V  M  19  Numbers                     12    01   70   10   82
V  M  20  Arithmetic signs             8    13   58   39   82
V  M  21  Nouns, verbs, adj           12    10   69   15   85
V  M  22  Sentences                    4    08   71   20   84
A  O  23  Nouns, verbs, adj           12    77   21  [?]  [?]
A  O  24  Abstr nouns, verbs, adj     12    77   13  -05   93
A  O  25  Pronouns, prepositions       8    76   24  -06   90
A  O  26  Color names                  4    76   11  -04   89
A  O  27  Names of sounds              4    78   16  -04   93
A  O  28  Sentences†                  21    60   08  -06   91
A  O  29  Sentences**                 11    63   12  -03   92
A  G  30  Nouns, verbs, adj           12    13   42   09   94
A  G  31  Pronouns, prepositions       4    20   42   00   93
A  G  32  Sentences†                  15    26   28   08  [?]
A  G  33  Sentences**                  5    19   30   09   93
A  M  34  Geometric forms              4    18   60   14   75
A  M  35  Numbers                      4    12   64  -13   81
A  M  36  Nouns, verbs, adj            8    36   68   04   85
A  M  37  Sentences                    4    12   57  -09   69

          Σa²/37                            138  162  019  .797

*Decimal points are omitted for all entries in the body of the table.
†Only syntactic words from sentences.
**Only semantic words from sentences.













invariant for this study when the largest 
6, 7, 8, 9, or 20 principal axes were 
included for rotation. (Twenty char- 
acteristic roots of the 37 x 37 reduced 
correlation matrix were positive.) Pro- 
jections of the 37 variables upon the 
first six principal axes appear in Table 3.


Despite dissimilarities in sizes of 
loadings and in correlations among 
factors, all solutions are in essential 
agreement in displaying five clearly 
defined common factors and a sixth, 
doublet factor. 


Simple structure is most striking for 
the oblique solutions. However, the 
sizable correlations among factors lead
to difficulties of interpretation, and the
orthogonal varimax solution is presented
here.³ The solution appears in Table 4.
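For readers unfamiliar with the varimax criterion, the rotation can be sketched in a few lines. The routine below is an illustrative assumption, not the program run for this study: it maximizes the raw varimax criterion (the sum over factors of the variance of the squared loadings) by a widely used iterative scheme based on the singular value decomposition, and it omits the row normalization that some varimax procedures apply before rotation. All names and the six-variable loading matrix are hypothetical.

```python
import numpy as np

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return (np.asarray(loadings, dtype=float) ** 2).var(axis=0).sum()

def varimax(loadings, max_iter=200, tol=1e-10):
    """Orthogonally rotate a loading matrix toward the varimax criterion.

    Returns (rotated_loadings, T) with rotated_loadings = loadings @ T.
    """
    A = np.asarray(loadings, dtype=float)
    n_vars, n_factors = A.shape
    T = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        L = A @ T
        # Gradient of the criterion with respect to the rotated loadings.
        gradient = A.T @ (L ** 3 - L * (L ** 2).sum(axis=0) / n_vars)
        U, s, Vt = np.linalg.svd(gradient)
        T = U @ Vt                      # nearest orthogonal matrix
        current = s.sum()
        if previous and current < previous * (1.0 + tol):
            break
        previous = current
    return A @ T, T

# Hypothetical two-factor simple structure, deliberately turned by 30 degrees.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                   [0.0, 0.9], [0.0, 0.7], [0.0, 0.5]])
angle = np.deg2rad(30.0)
turn = np.array([[np.cos(angle), -np.sin(angle)],
                 [np.sin(angle),  np.cos(angle)]])
unrotated = simple @ turn
rotated, T = varimax(unrotated)
```

Because `T` is orthogonal, the communality of every variable is the same before and after rotation; only the distribution of variance among the factors changes.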


(In the following informal tables and in Table 
4 these abbreviations are used: variable iden- 
tification by number, VI; stimulus, S; re- 
sponse, R; factor loading, Ldg; aural stimu- 
lus, A; visual stimulus, V; oral response, O; 
graphic response, G; matching response, M.) 


Factor A. For all analyses, Factor A 
is defined best by all items demanding 
oral response to visual stimuli. (For 
the oblique solutions, no other classes 
of items are prominently represented 
on the factor.) Since such items formed 
the most numerous item class, this 
factor accounts for more common 
variance than does any other, 40% of 
common factor variance, 32% of total 
variance. The defining items, together 
with their loadings as obtained on the 
varimax solution, are listed below. 


³Results of the other analyses are available
from the authors upon request. The varimax
solution given in Table 4 is that which re-
sulted from an input of six principal axes.


VI  S  R  Item                       Ldg
 4  V  O  Verbs                      .89
 5  V  O  Adjectives                 .88
 6  V  O  Abstr nouns                .88
 8  V  O  Abstr adjectives           .88
 7  V  O  Abstr verbs                .86
10  V  O  Color names                .85
13  V  O  Sentences*                 .84
 9  V  O  Pronouns, prepositions     .83
 3  V  O  Nouns                      .83
11  V  O  Names of sounds            .82
12  V  O  Sentences†                 .81
28  A  O  Sentences†                 .63
29  A  O  Sentences*                 .61
33  A  G  Sentences*                 .55
32  A  G  Sentences†                 .54
30  A  G  Nouns, verbs, adjectives   .50
27  A  O  Names of sounds            .48
24  A  O  Abstr nouns, verbs, adj    .47
31  A  G  Pronouns, prepositions     .47
23  A  O  Nouns, verbs, adj          .46
26  A  O  Color names                .43
21  V  M  Nouns, verbs, adj          .40


*Only semantic words for sentences.
†Only syntactic words for sentences.


All items which involve oral response 
to visual stimuli display loadings on 
Factor A in excess of .80. It is tempting 
to interpret the factor as one repre- 
senting the transmission and translation 
from visual stimulus to oral response. 


Note that, following the visual-oral 
items in order of size of loadings on 
Factor A are the four items demanding 
oral or graphic response to aurally 
presented sentences, all of which exhibit 
loadings in the range .55 to .62. This 
finding is consistent with the interpreta- 
tion of Factor A as a visual-oral trans- 
mission factor if such transmission can 
be considered to play a function in 
oral or graphic response to sentences. 
The finding suggests that the prob- 
ability of correct oral repetition of an 
aurally presented sentence is enhanced 
for individuals able to maintain a visual 
image of the sentence after hearing it. 
Only then could the visual-oral trans- 







mission function be relevant to the task. 
In addition, that graphic response to 
spoken sentences is represented on the 
factor suggests the functions both of 
visual retention and of subvocal oral 
response as possible intervening steps 
between aural stimulation and written 
response. (Even on the oblique factor 
solutions, these variables, particularly 
28 and 29 which entail oral response to 
aurally presented sentences, exhibit 
positive projections on the vector 
representing Factor A.) 


Many of the remaining variables 
exhibit substantial Factor A loadings on 
the orthogonal varimax solution. This 
represents some tendency toward a 
general factor, in the sense that all tasks 
on the test tend to be positively cor- 
related. Seven of the 37 variables, how- 
ever, exhibit loadings less than .20. 


Factor B. Factor B contributes 18% 
of common variance, 14% of total var- 
iance, and is represented primarily by 
the variables shown below. 


VI  S  R  Item                       Ldg
27  A  O  Names of sounds            .78
23  A  O  Nouns, verbs, adj          .77
24  A  O  Abstr nouns, verbs, adj    .77
25  A  O  Pronouns, prepositions     .76
26  A  O  Color names                .76
29  A  O  Sentences*                 .63
28  A  O  Sentences†                 .60


*Only semantic words from sentences.
†Only syntactic words from sentences.


Those variables with highest loadings 
on Factor B include those variables de- 
manding oral response to aural stimuli. 
All such variables exhibit loadings of 
.60 or greater; no other variable exhibits 
a loading as great as .40. 

Factor B would seem to represent 
the transmission of aurally received 
stimuli into orally produced response. 


That this transmission ability can
play a part in the success of patients on
items other than those which require
oral response to aural stimuli is sug-
gested by loadings of items requiring
oral response to visual stimuli (loadings
of .25 to .36), of items requiring match-
ing of visual alternatives to aurally
presented words (.36), and of items
requiring graphic response to aurally 
presented sentences (.19 and .26). In 
each case, it is tenable that by ‘re- 
auditorization’ patients would better be 
able to perform the assigned tasks, and 
that aural to oral transmission is neces- 
sary for this process. The test variables 
which exhibit near-zero projections 
upon Factor B include only those 
involving graphic response to visual 
stimuli or matching response to visual 
stimuli, tasks in which the aural-oral 
channel could not be anticipated to 
play a prominent role. 


Factor C. Factor C accounts for 14% 
of the common-factor variance, 11% 
of total variance. Only four variables 
exhibit loadings in excess of .40, and 
these exhaust the items which demand 
copying of visual stimuli: 


VI  S  R  Item                       Ldg
15  V  G  Verbs                      .83
14  V  G  Nouns                      .81
16  V  G  Adjectives                 .81
17  V  G  Pronouns, prepositions     .77


Factor C apparently involves trans- 
mission from visual stimuli to graphic 
response. 


Other variables, with loadings be- 
tween .22 and .39 include the majority 
of the matching variables, both auditory 
and visual stimuli, as well as all variables 
demanding graphic response to auditory 

























stimuli. Conjecturing the possibility of 
a visual image of a word instigated by 
aural stimulation, it is reasonable that 
these tasks might be aided by intact 
visual-graphic transmission. 


Factor D. The four variables with 
projections on Factor D in excess of 
.40 are listed below. Factor D accounts 
for approximately 7% of common 
variance, 6% of total variance. 


VI  S  R  Item                       Ldg
32  A  G  Sentences*                 .65
33  A  G  Sentences†                 .62
31  A  G  Pronouns, prepositions     .60
30  A  G  Nouns, verbs, adj          .59


*Only syntactic words from sentences.
†Only semantic words from sentences.


This list includes all test items which 
involve written response to aural stim- 
ulation. It appears to reflect aural to 
graphic transmission. No other variables 
display loadings greater than .25. 


Factor E. Factor E accounts for 18% 
of common factor variance, 15% of 
total variance, second in prominence 
only to Factor A. The variables which 
display loadings of .40 or greater are: 


VI  S  R  Item                       Ldg
18  V  M  Pictures                   .78
22  V  M  Sentences                  .71
19  V  M  Numbers                    .70
21  V  M  Nouns, verbs, adj          .69
36  A  M  Nouns, verbs, adj          .68
35  A  M  Numbers                    .64
34  A  M  Geometric forms            .60
20  V  M  Arithmetic signs           .58
37  A  M  Sentences                  .57
 1        Age                       -.49
31  A  G  Pronouns, prepositions     .43
30  A  G  Nouns, verbs, adj          .42


All nine matching tests head the list 
of defining variables for Factor E; five 
tests require matching of visual stimuli 
to visually presented alternatives, while 
four require matching of auditory 
stimuli to visually presented alterna- 
tives. These variables have in common 
their dependence upon the need for 
comprehension of symbols. The task 
demands correct matching of a verbal 
symbol to a pictorial representation of 
that denoted by the symbol. Variables 
30 and 31 are the only two which 
require written response to spoken 
words. With comprehension of word 
meaning, it is tenable that the likeli- 
hood of correct response is enhanced. 

Factor E clearly represents an ability 
to comprehend language symbols, since 
aural as well as visual stimuli are prom- 
inently involved. Factor E, in a central 
way, represents aphasia as distinct from 
the agnosias and apraxias which depend 
upon input-output modalities. 

Factor E is the only factor on which 
age is prominently represented. The 
loading of age, —.49, indicates a tend- 
ency for older patients to perform 
less well on the tasks which define the 
factor. The finding may reflect the 
general deterioration of comprehension 
abilities which accompanies increasing 
age of the nervous system, an effect 
probably accentuated by cortical dam- 
age. 

It is instructive to note, from Table 
4, the nature of other variables with 
nonzero projections on Factor E. Load- 
ings between .20 and .40 are displayed 
by several variables involving graphic 
response to visually or aurally presented 
words and sentences, as well as variables 
which require oral response to visually 
presented words or sentences. In the 
case of each such variable, it is tenable 







that comprehension of the stimulus, 
understanding the words, would en- 
hance the likelihood of correct re- 
sponse. 


Factor F. Accounting for only about 
2% of the variance, Factor F displays 
only two appreciable loadings: 


VI  S  R  Item                       Ldg
 2        Education                  .59
20  V  M  Arithmetic signs           .39


Variable 20 requires selection of cor- 
rect answers among alternative choices 
presented as possible solutions to arith- 
metic problems, also presented visually. 
Addition, subtraction, multiplication, 
and division are represented, each indi- 
cated by its conventional sign (+, —, 
x, +). This arithmetic variable appears, 
with all other matching tests, on Factor 
E, but it alone is displayed on Factor 
F, together with number of years of 
premorbid education. 

One might view Factor F as a rather 
trivial doublet, representing a relation- 
ship between education and arithmetic, 
expected to appear for intact individuals 
as well as aphasic patients. Such a view 
probably is in error. The arithmetic 
problems presented are so simple that 
individuals with even some grade school 
experience would be anticipated to 
perform correctly. There is little ques- 
tion, however, that among brain-intact 
subjects the tasks would be performed 
with greater alacrity by those with 
greater educational experience. It would 
appear from these results that such 
individuals are less likely to lose their 
arithmetic skills after brain damage than 
are individuals whose education was 
less and whose initial arithmetic experi- 
ence may have been narrow. 


Remaining Factors. No principal axes 
factor beyond the sixth accounts for 
more than 1% of common factor 
variance, and all may be considered 
residual factors. 
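The two percentages quoted for each factor can be cross-checked against one another. The sketch below assumes that the final entry in the bottom row of Table 4, .797, is the mean communality, so that a factor's share of common factor variance equals its share of total variance divided by .797. Because the quoted figures are rounded to whole percentages, agreement within about one percentage point is all that can be expected. The dictionary names are hypothetical; the numbers are those given in the text.

```python
# Share of total variance quoted in the text for each factor.
total_share = {"A": 0.32, "B": 0.14, "C": 0.11, "D": 0.06, "E": 0.15, "F": 0.02}

# Share of common factor variance quoted in the text (Factor F is given
# only as "about 2% of the variance", so it is left out of this comparison).
quoted_common_share = {"A": 0.40, "B": 0.18, "C": 0.14, "D": 0.07, "E": 0.18}

# Assumption: the last entry of the bottom row of Table 4 is the mean communality.
mean_communality = 0.797

computed_common_share = {
    factor: share / mean_communality for factor, share in total_share.items()
}

for factor, quoted in quoted_common_share.items():
    computed = computed_common_share[factor]
    print(f"Factor {factor}: quoted {quoted:.0%}, computed {computed:.1%}")

# The six shares of total variance should also add up to the mean communality.
sum_of_total_shares = sum(total_share.values())
```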


Discussion 


Results of this study clearly demon-
strate the existence of several dimen-
sions which underlie test performance
of aphasic patients, and thus argue
against the hypothesis that language
disturbance after brain damage may be
viewed as a unitary, general disorder.
It follows from the findings that a single
test score is not a sufficient indicant of
degree of aphasic disorder. Performance
of aphasic patients may differ in kind
as well as in extent. As a consequence,
useful information is provided by
separate scores representing the possible
transmission channels which mediate
language acts, and a separate ‘compre-
hension’ score also will be of value.
The correlations among these scores
will be far less than unity in the pop-
ulation of aphasic patients from which
the 168 cases of the present study were
selected.


The results are in sharp contrast to 
those presented by Schuell and Jenkins 
(7), and critical assessment of the re- 
sults of those investigators thus is 
appropriate. The Schuell and Jenkins 
conclusions appear to result from two 
major sources: (a) a highly restricted 
selection of tests upon which the pri- 
mary analysis is based, and (b) an 
inappropriate method of analysis used 
to test for ‘the existence of a single
hierarchy of language functions.’ 

Schuell and Jenkins analyzed per-
formance of 100 aphasic patients on
29 tests, seven of which involved visual
presentation of stimuli, 22 of which
depended upon aural stimulation. For
the major analysis, 18 of these tests
were selected; the criterion for selec-
tion was high correlation of a test with
the remainder of the battery. Remain-
ing among the 18 were only three tests
which required visual presentation and
15 which required aural stimuli. The
criterion for selecting among the 29 
tests (already designed to emphasize 
certain functions, for example, match- 
ing to aural stimulation) clearly served 
to further increase the homogeneity of 
the test materials. The functions re- 
quired on the 18 tests do not adequately 
represent the various transmission func- 
tions (11 of the 18 tests require a 
matching response to aural stimulation, 
and no more than two tests require use 
of any of the transmission channels 
defined by Factors A through D of the 
present study). Thus the selection of 
tests predisposed towards finding a 
single, general aphasia factor. 


Whether such a single factor would 
emerge from the 18 tests remains un- 
clear. No analysis was performed which 
yields an appropriate test of this. As 
Carroll⁴ has demonstrated, the Guttman
scale analysis which was employed by 
Schuell and Jenkins is inappropriate for 
an investigation of the dimensionality 
of a psychological domain. A high 
‘coefficient of reproducibility’ may be 
obtained from factorially complex tests, 
even when multiple factors are repre- 
sented. 
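The point about reproducibility can be explored numerically. The sketch below is an assumption-laden illustration, not the procedure of Schuell and Jenkins or of Carroll: it scores each test pass or fail, orders the tests from easiest to hardest, counts every cell that departs from the ideal cumulative pattern implied by a subject's total score, and reports one minus the proportion of such cells. The simulated battery is hypothetical: ten tests that depend alternately on two independent abilities, with difficulties spread from very easy to very hard.

```python
import numpy as np

def coefficient_of_reproducibility(passes):
    """1 - (cells departing from the ideal scale pattern) / (all cells).

    passes[s, t] is 1 (or True) if subject s passed test t, else 0.
    """
    X = np.asarray(passes, dtype=int)
    n_subjects, n_tests = X.shape
    # Order tests from most often passed (easiest) to least often passed.
    order = np.argsort(-X.sum(axis=0), kind="stable")
    X = X[:, order]
    totals = X.sum(axis=1)
    # Ideal pattern: a subject with total k passes exactly the k easiest tests.
    ideal = (np.arange(n_tests)[None, :] < totals[:, None]).astype(int)
    errors = (X != ideal).sum()
    return 1.0 - errors / (n_subjects * n_tests)

# Hypothetical two-factor battery: even-numbered tests draw on ability 0,
# odd-numbered tests on ability 1; the two abilities are independent.
rng = np.random.default_rng(0)
abilities = rng.normal(size=(200, 2))
factor_of_test = np.array([0, 1] * 5)
thresholds = np.linspace(-2.5, 2.5, 10)
two_factor_passes = abilities[:, factor_of_test] > thresholds
rep_two_factor = coefficient_of_reproducibility(two_factor_passes)
```

Varying the spread of `thresholds` shows how strongly the coefficient depends on test difficulty rather than on the number of abilities involved, which is the sense in which a high value is weak evidence of a single dimension.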


While test selection and methodolog- 
ical choice are sufficient reasons to 


⁴J. B. Carroll, A Note on Reproducibility
and Dimensionality, unpublished manuscript, 
1958. 


question the generality of the Schuell 
and Jenkins findings, other difficulties 
remain to be briefly noted. (a) Of the 
100 patients, 17 failed essentially all 
items presented them. The failure to 
reject these patients beforehand serves 
artificially to accentuate the relations 
among the tests. (b) Difficulty of tests 
was rarely moderate for those selected 
for the major analysis. The distribu- 
tions of scores suggest that some items 
were too easy, others were too difficult 
to enable sharp discrimination among 
patients or among distinct components 
of language tasks. (c) A score of one 
was assigned if a test was performed 
without error, otherwise (if one or 
more errors occurred) zero was as- 
signed. The failure to discriminate 
degree of error may further serve to 
camouflage distinctions among language 
functions. 


Summary 


Six factors were found sufficiently 
prominent to warrant interpretation 
when intercorrelations among 37 var- 
iables on an aphasic sample were sub- 
jected to varimax factor solution. Of 
these six factors, four appear to 
represent input-output transmission 
functions: Factor A, visual to oral 
transmission; Factor B, aural to oral 
transmission; Factor C, visual to graphic 
transmission; Factor D, aural to graphic 
transmission. Factor E, in contrast,
transcends stimulus modality. It is de- 
fined by variables which require the 
matching, from one symbol system to 
another, of symbols with equivalent 
meaning. Factor E is interpreted as 
representing ability to comprehend 
language symbols. Factor F appears to 
represent ability to perform simple 







arithmetic operations, a function strong- 
ly affected by the educational attain- 
ment of the patient. 

It should be emphasized that 
essentially the same factors appeared 
from several alternative factor solutions. 
The solution presented here is en- 
tirely analytic, and involved no sub- 
jective judgments of the investigators 
other than those required in interpret- 
ing and deriving conclusions from the 
final results. 

The finding of relatively distinct 
interpretable factors of aphasic test 
performance strongly suggests that var- 
iables corresponding to the factors 
should be given explicit attention when 
testing for aphasia. Without adopting 
this strategy, two patients might receive 
the same score on a composite test 
for entirely different reasons. Only by 
distinguishing performance on each of 
the abilities reflected by the factors can 
one learn relative strengths and weak- 
nesses in language performance of 
aphasic patients, information vital to 
useful diagnosis and treatment. 


Acknowledgments 


The authors are extremely grateful 
to Dr. Henry F. Kaiser, University of 
Illinois, for supervising computational 
aspects of varimax solutions on the 
IBM 701, University of California at 
Berkeley, and on the Illiac, University 
of Illinois, as well as to Dr. John B. 
Carroll, Harvard University, for super- 


vising computational aspects of the 
oblimin solution on the IBM 704, 
Massachusetts Institute of Technology. 
The oblimax solution and a hierarchical 
solution on the 37 variables were per- 
formed by Mr. John Horn, University 
of Illinois. (Harman, 4, provides ex- 
cellent discussion of the alternative 
factor methods.) 

The authors are indebted to the Oak 
Ridge Institute for Nuclear Studies, and 
to Mr. C. T. Fike, ORACLE Applica- 
tions Program, University Relations 
Division, for supervision of coding and 
computations of the principal axes 
factor solution. 


References 


1. Bock, R. D., Methods and applications of
optimal scaling. Univ. North Carolina,
Psychometric Lab. Rep., No. 25, 1960.

2. Carroll, J. B., A factor analysis of verbal
abilities. Psychometrika, 6, 1941, 279-308.

3. Fisher, R. A., Statistical Methods for Re-
search Workers. (10th ed.) Edinburgh:
Oliver and Boyd, 1946.

4. Harman, H. H., Modern Factor Analy-
sis. Chicago: Univ. Chicago Press, 1960.

5. Head, H., Aphasia and Kindred Disorders
of Speech. New York: Macmillan, 1926.

6. Kaiser, H. F., The varimax criterion for
analytic rotation in factor analysis. Psy-
chometrika, 23, 1958, 187-200.

7. Schuell, Hildred, and Jenkins, J. J., The
nature of language deficit in aphasia. Psy-
chol. Rev., 66, 1959, 45-67.

8. Wepman, J. M., and Jones, L. V., The
Language Modalities Test for Aphasia.
Chicago: Univ. Chicago, Ind. Rel. Ctr.,
1961.

9. Weisenburg, T., and McBride, K., Apha-
sia: A Clinical and Psychological Study.
New York: Commonwealth Fund, 1935.


Editor’s Note: Comments on this article appear in Letters to the Editor, pages 295-299. 
















Bilabial Stop and Nasal Consonants: 


a Motion Picture Study 


and its Acoustical Implications 


OSAMU FUJIMURA 


Although there have been some at- 
tempts to record the motion of the 
articulators during speech production 
(16, 17, chap. 2), few precise descrip- 
tions of these activities have been made. 
In particular, the rapid changes which 
occur in speech production are not well 
understood. This lack of knowledge 
becomes particularly apparent, for ex- 
ample, when an experimenter tries to 
produce a syllable such as /pa/ by 
means of an electrical analog (15), and
he finds that he is unable to give in 
advance a detailed specification of the 
rates at which the component parts of 
the speech mechanism must move. 
Some information regarding articula- 
tory changes may be derivable from a 
study of formant changes revealed in 
spectrograms, but such information is 





Osamu Fujimura (B.S., University of
Tokyo, 1952) is Assistant Professor at the 
Research Institute of Communication Sci- 
ence, University of Electro-Communications, 
Tokyo, Japan, on leave as a staff member of 
the Research Laboratory of Electronics, 
Massachusetts Institute of Technology. A 
portion of the article is based on a paper 
presented at the 59th meeting of the Acous- 
tical Society of America, June 1960, Provi- 
dence, Rhode Island. The research was 
supported in part by the U. S. Army Signal 
Corps, the Air Force Office of Scientific Re- 
search, and the Office of Naval Research; and 
in part by the National Science Foundation. 




quite limited, since the principal artic- 
ulatory change associated with a given 
formant shift may not be clear. 

It is difficult, of course, to measure 
all the variables which describe speech 
production, or to estimate the physical 
constants that may represent the parts 
of an analogous mechanical system, 
even if approximated by a simplified 
model. If, however, certain essential 
parts of articulatory behavior can be 
described accurately, then these de- 
scriptions can serve as a basis for appro- 
priate experiments in speech synthesis 
that will lead to specification of other 
aspects of the articulatory process. 

In the experiment to be described, 
close observations were made of the 
articulators in action. In order to in- 
troduce no aberration in the articula- 
tion, an optical method was selected for 
data collection. The particular artic- 
ulatory action to be observed in the 
experiment was the explosion phase of 
the production of bilabial stop and 
nasal consonants. 


Experimental Procedures 


The talker was illuminated inter- 
mittently by a stroboscopic technique 
(6). In the major part of the experi- 
ment motion pictures were taken at 240 
frames per second. The film was run
continuously in a standard oscilloscope
camera with the subject seated in a dark
room. The exposure duration for each
frame as provided by a stroboscopic
light was several microseconds. The
talker used a manual button to start the









Figure 1. Front views of the mouth of the subject pronouncing (a) pock, (b) bock, and
(c) mock. All sequences represent the beginning phase of the mouth opening immediately
after the plosion of the initial consonant. The time interval between adjacent frames is
approximately 4 msec, and time proceeds from right to left.




















Fujimura: Bilabial Stop and Nasal Consonants 235 


camera when he was about to utter a 
word, and this button also controlled 
the initiation of the stroboscope after 
a fixed delay time to allow the film to 
reach its appropriate constant speed. 
Simultaneously, the utterance was 
recorded on the first track of a high- 
quality dual-track magnetic tape re- 
corder, and the second track of the 
recorder received a pulse train which 
was also used to fire an argon lamp that 
provided time markers on the photo- 
graphic film. One native speaker of 
American English trained in phonetics 
generated all of the utterances. 

The words that were analyzed are 
as follows: 


pock, bock, mock, a pock, a bock, a mock, spock;
peat, beat, meat, a peat, a beat, a meat, speech;
pope, bope, mope, a pope, a bope, a mope, spope;
prove, a prove, spruce.

With one exception (the utterance spope), the utterances were English
words, and were selected to provide a variety of vowels (/i/, /o/, and
/a/) after the consonants /p/, /b/, and /m/. Whenever possible each
syllable was
terminated in a tense stop consonant, 
and the final consonants were selected 
to provide a minimal change in artic- 
ulation from the stressed vowel to the 
consonant. To test the influence of an 
unstressed initial syllable, words were 
prepared with a preceding schwa vowel. 
In order to observe the effect of some other environments on tense
stop production, words containing /pr/ and /sp/ were included in the
materials. The subject was familiar with the materials before the
recording session and pronounced each word at a rate that was natural
for him.


Before the data analyzed in the ex- 
periment were collected, preliminary 
experiments using various frame rates 
between 60 fps and 1 000 fps were per- 
formed to select an appropriate ex- 
posure rate. These experiments demon- 
strated that the initial opening of the 
lips occurred sufficiently rapidly to re- 
quire an interframe interval of less than 
5 msec for close observation of the 
process. Very fast exposures were im- 
practical, however, if a sizeable number 
of words was to be processed. Since it 
was deemed more important to com- 
pare the three consonants in various 
phonetic contexts rather than to ex- 
amine one or two utterances minutely, 
an exposure rate of 240 fps was selected 
as a compromise for the major part of 
the study. 
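
The trade-off described above can be checked with a little arithmetic. The following minimal sketch (the function name is illustrative and not part of the original study) converts the frame rates mentioned in the text into interframe intervals, showing that 240 fps gives roughly 4 msec between exposures, inside the 5-msec requirement.

```python
def interframe_interval_msec(frames_per_second):
    """Time between successive exposures, in milliseconds."""
    return 1000.0 / frames_per_second

# Frame rates mentioned in the text: the limits of the preliminary range
# (60 and 1000 fps) and the rate chosen for the main study (240 fps).
for fps in (60, 240, 1000):
    interval = interframe_interval_msec(fps)
    print(f"{fps:5d} fps -> {interval:6.2f} msec between frames")

# Interval at the chosen rate; the text requires less than 5 msec.
chosen = interframe_interval_msec(240)
```

At 60 fps the interval is well above 5 msec, which is consistent with the statement that the slower rates were inadequate for observing the initial opening.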


Results 


Figure 1 shows some typical front 
(full-face) views of the subject. The 
first frame (at the right) of the upper 
row is the last exposure before the 
plosion of the initial consonant in the 
word pock. The exposure taken 4 
msec later shows the lips separating, 
and the subsequent exposures show a 
continuation of this process. The second 
and third rows of exposures represent 
the production of bock and mock, respectively. Comparison of the lip
configurations for these three sequences
reveals clear differences between the 
role of the lips in the production of 
bilabial nasal consonants and bilabial
stop consonants. 


Figure 2. Profiles of the subject pronouncing pock. The frame interval
is approximately 4 msec. Negative numbers of frames refer to the frames
before the plosion of /p/. The frame at the extreme right corresponds
to a time moment 330 msec before the plosion.

This gross difference is better understood when some profile exposures
taken as a supplement to the experiment are examined. Figure 2 presents a
series of profile frames for the word 
pock. The number under each frame 
identifies that frame with respect to the 
exposure immediately preceding the 
opening of the lips; negative numbers 
identify exposures preceding this frame. 
Examination of the outlines of the lips 
shows that the lips are pushed forward 
before they are blown apart. As soon as 
the plosion occurs, the lips resume a re- 
laxed shape. The front views of /p/ 
and /b/ production (see Figure 1) 
reflect these changes in the upper lip, 
particularly in its apparent thickness, 
just before and after the plosion. 

The mechanism responsible for the 
tissue deformation visible in the pictures is probably an overpressure in the
air behind the obstruction formed by 
the lips in the stop phase of the pro- 
duction of stop-plosive consonants. 
When the pressure is released by the 
rapid opening of the closure, a highly 
damped oscillation of the lips occurs. 
(This oscillation will be evident in the 
quantitative data to be presented be- 
low.) Examination of the front views 
taken during the production of /m/ 
does not reveal similar changes in tissue 
configuration at the time of release. In 


this case the overpressure cannot be 
built up appreciably because the open 
nasal port provides a bypass for the air 
flow. 

Figure 3. Plots of vertical distance across the lip opening as
functions of time (frame number). Arrows indicate the onset times of
glottal vibration. (Three panels: /mit/, /pit/, /bit/, and /spitʃ/;
/əˈmit/, /əˈbit/, and /əˈpit/; /sprus/, /əˈpruv/, and /pruv/.
Abscissa: frame number (× 4 = time in msec).)

Measurement of Mouth Opening. A number of simple measurements were
made to provide some quantitative description of the articulatory
processes under view. A photographic enlarger was used to project the
films frame by
frame, and measurements were made on 
the projected image, which was slightly 
larger than life size. Care was taken to 
avoid parallax errors and other changes 
of the scale factor due to variation of 
the position of the mouth. In particular, 
distances between selected edges of the 
lower teeth were checked during the 
measurements by fitting the projected 
image to a carefully prepared drawing. 


Midsagittal Lip Separation. One variable used to describe the mouth opening
was the vertical distance separating the 
lips at the center of the mouth, plotted 
as a function of time or frame number. 




(The midsagittal lip separation, by the 
way, was not necessarily the maximum 
vertical distance between the lips.) 
Efforts were made to make all such 
measurements in one plane. 

As typical examples of the results, 
data for the words containing the 
stressed vowel /i/ are shown in the 
upper two sections of Figure 3. It is 
seen in the figure that the word starting 
with the tense stop (peat) shows a par- 
ticularly high speed of separation im- 
mediately after the plosion. After this 
rapid change, there is a period in which 
the change is markedly slower. This 
change of speed is characteristic of the words containing stops, except
when the word is initiated with /s/. The initial speed, however, is
much smaller for the tense stop preceded by a schwa (a peat) than for
the initial tense stop (peat). This difference in the initial speed is
not seen when beat and a beat or meat and a meat are compared.

Table 1. Measures of the speed of lip opening for bilabial consonants
in various contexts. The various measures of lip opening 5 msec after
the explosion (the first column for each vowel) are compared with the
values corresponding to the maximum opening phase of the vowel (the
second column).

Leading                        Stressed Vowel
Sounds         /i/           /o/           /a/           /ru/
          5 msec  max   5 msec  max   5 msec  max   5 msec  max

(a) Vertical Separation (in mm)
/p/          4.0   15      3.0   14      4.5   15      1.5    8
/b/          3.0   13      2.0   12      3.0   17
/m/          2.0   14      2.0   10      3.5   18
/əp/         2.0   11      1.0   11      2.0   13      1.5    6
/əb/         3.0   11      2.0    8      3.0   15
/əm/         2.5   11      1.0    9      3.5   13
/sp/         0.5   12      1.5   10      3.0   16      1.0    7

(b) Horizontal Width Divided by 2 (in mm)
/p/          9.0   22     12.0   16     13.5   23      4.0   12
/b/         10.0   22     10.0   14     13.0   23
/m/          7.0   22      9.0   14     10.5   22
/əp/         8.5   22      7.0   15     10.5   20      4.0   12
/əb/         8.0   21      6.5   15      9.5   21
/əm/         6.5   21      5.5   14     10.0   20
/sp/         4.0   19      6.0   16      7.0   22      4.5   11

(c) Calculated Area (in mm²)
/p/           28  260       28  180       48  190        5   75
/b/           24  220       16  130       31  310
/m/           11  240       14  110       29  310
/əp/          13  190        6  130       16  200        5   55
/əb/          19  180       10   90       22  170
/əm/          13  180        4  100       27  200
/sp/           2  180        7  130       16  280        4   60

For the words containing /pr/ the 
data plotted in the lower portion of 
Figure 3 show that the lips separate at 
a markedly lower rate than for 
/p/ + stressed vowel, and the curves
clearly show an oscillatory component. 
This oscillatory tendency, which is 
most clear when /r/ follows /p/, is 
also seen when /p/ and /b/ precede 
stressed vowels. The motion of the inner edges of the lips can apparently be
analyzed into two components super- 
imposed on each other: one is a regular 
variation that includes the motion of 
the mandible (see infra), and the sec- 
ond is a damped vibration of one or 
both lips. The characteristic period of 
the oscillation is approximately 25 to 30 
msec. In general, the opening character- 
istics of /m/ are relatively smooth and 
have no vibratory components. 
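
The two-component description above can be illustrated with a toy model. The sketch below is only an assumption-laden illustration: the functional form (a linear trend plus an exponentially damped sinusoid) and every parameter value except the period, which is taken from the 25 to 30 msec range quoted in the text, are hypothetical and were not fitted to the published curves.

```python
import math

def lip_separation_mm(t_msec, slow_rate=0.1, amplitude=1.5,
                      period=27.5, decay=15.0):
    """Toy lip-separation curve: slow trend plus a damped vibration.

    slow_rate (mm/msec), amplitude (mm) and decay (msec) are hypothetical;
    period (msec) lies inside the 25 to 30 msec range given in the text.
    """
    trend = slow_rate * t_msec
    vibration = (amplitude * math.exp(-t_msec / decay)
                 * math.sin(2.0 * math.pi * t_msec / period))
    return trend + vibration

# Sample the model at the 4-msec frame spacing used in the study.
for frame in range(0, 11):
    t = 4.0 * frame
    print(f"frame {frame:2d}  t = {t:4.0f} msec  separation = {lip_separation_mm(t):5.2f} mm")

# Average opening speed in an early and in a later 5-msec window (mm/msec).
early_speed = (lip_separation_mm(5.0) - lip_separation_mm(0.0)) / 5.0
later_speed = (lip_separation_mm(15.0) - lip_separation_mm(10.0)) / 5.0
```

With these illustrative values the model reproduces the qualitative pattern reported for stops, a fast initial separation followed by a markedly slower phase; setting `amplitude` to zero gives the smooth opening described for /m/.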

The observations described above in connection with Figure 3 were
generally valid for other utterances containing different vowels.
Measures of the initial speeds for all words are given in Table 1(a).
This table shows the values of the midsagittal lip separation 5 msec
after the plosion as well as the maximum values that were attained in
the later part of the vowels. It is clear, at least for the utterances
observed in this experiment, that the opening at the first 5 msec is
significantly larger for initial /p/ than for initial /b/ or /m/. The
values for intervocalic /p/ are about one-half to one-third of those
for initial /p/, when the consonant is not followed by /r/. In similar
fashion, when /p/ follows /s/ the initial rate of separation of the
lips is small compared to the rate for initial /p/.

Figure 4. Plots of various dimensions relevant to the mouth opening.
The curves are labelled as follows: (a) horizontal width divided by 2
(in mm); (b) lip-opening area (in mm²); (c) vertical distance (in mm);
(d) distance between the lower teeth and the upper lip; (e) position of
the lower teeth referred to fixed items on the upper part of the face.
The utterance is pope. The area was calculated from the vertical
separation and the horizontal width of the lip opening assuming these
to be the two axes of an ellipse. The position of the lower teeth is
referred to fixed items on the face (arbitrary zero point). Abscissa:
frame number (× 4 = time in msec).

The table also shows that the extent 
of the opening in the central part of the 
stressed vowels seems to be systemati- 
cally different depending on whether 
the consonant is in the initial position 
or is preceded by a schwa. In the for- 
mer case, where the speed of opening is 
more rapid, the opening in the later 
stage reaches a higher value than it 
does in the latter case. 

In order to determine the values given 
in the table from the plotted curves, it 
was necessary to establish with precision 
when the explosion occurred. Since the 
articulatory act is described by a sequence of exposures with a finite
interframe time, the last exposure showing
closed lips does not correspond neces- 
sarily to the exact moment of the ex- 
plosion. (In consequence, the apparent 
difference in the lip opening shown in 
the second frames of the first and 
second rows in Figure 1, for example, 
does not necessarily imply different 
rates of separation for the two utter- 
ances.) The moment of release for var- 
ious words, therefore, was estimated by 
extrapolating curves of vertical lip 
separation to the abscissa. 
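
The extrapolation step can be sketched as follows. The text does not say how the curves were extrapolated, so a straight line through the first two open-lip samples is assumed here, and the sample values are hypothetical rather than measured.

```python
def estimate_release_frame(frames, separations_mm):
    """Extrapolate the first two open-lip samples linearly back to zero.

    Returns the (fractional) frame number at which the fitted line
    crosses zero separation. Linear extrapolation is an assumption.
    """
    open_samples = [(f, s) for f, s in zip(frames, separations_mm) if s > 0]
    (f0, s0), (f1, s1) = open_samples[:2]
    slope = (s1 - s0) / (f1 - f0)      # mm per frame
    return f0 - s0 / slope             # frame value where the line hits zero

# Hypothetical measurements: frame 0 is the last exposure with closed lips.
frames = [0, 1, 2, 3]
separations_mm = [0.0, 1.5, 4.5, 6.0]

release_frame = estimate_release_frame(frames, separations_mm)
release_msec = release_frame * 1000.0 / 240.0   # frames at 240 fps
print(f"estimated release at frame {release_frame:.2f} "
      f"({release_msec:.1f} msec after the last closed exposure)")
```

In this made-up case the release falls halfway between the last closed frame and the first open one, which illustrates why the last closed exposure cannot be taken as the moment of explosion.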

Measurements of lip opening in a
horizontal plane were also made, and 
an example of such measurements is 
given in Figure 4 where data relative to 
the word pope are displayed. In general, 
the horizontal measurements reflect the 


tendencies already seen in the plots of 
the vertical measures. The rate of
growth in the first 5 to 10 msec in this
dimension, however, is considerably
higher than that for the vertical distance,
if the two dimensions are compared
in terms of percentage relative to
their maximum values in the later stage
of the vowel. The values reached
5 msec after the plosion are listed in
Table 1(b), together with data for the 
central part of the vowel in the same 
utterances. In the case of initial stop 
followed by stressed vowels, for ex- 
ample, it can be seen from the values 
given in Table 1(b) that about 40 to 
75% of the maximum widths are 
attained within the first 5 msec. The 
corresponding values for the vertical 
distance, as shown in Table 1(a), are 
15 to 30% of the maximum values. 
The tendency is the same for any other 
utterance, and the contrast is usually
more marked when the opening proc- 
ess is slower. This characteristic of the 




horizontal dimension has an important 
bearing on the time dependence of the 
opening area. 
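
The percentage comparison above can be reproduced directly from the table. The sketch below uses only the initial /p/ row of Table 1 before the stressed vowels /i/, /o/, and /a/; the variable and function names are illustrative.

```python
# (value 5 msec after plosion, maximum value) for initial /p/ before
# /i/, /o/ and /a/, copied from Table 1(a) and Table 1(b).
vertical_mm = [(4.0, 15), (3.0, 14), (4.5, 15)]
half_width_mm = [(9.0, 22), (12.0, 16), (13.5, 23)]

def percent_of_maximum(pairs):
    """Express each 5-msec value as a percentage of the later maximum."""
    return [100.0 * early / maximum for early, maximum in pairs]

vertical_pct = percent_of_maximum(vertical_mm)
horizontal_pct = percent_of_maximum(half_width_mm)

print("vertical:  ", ", ".join(f"{p:.0f}%" for p in vertical_pct))
print("horizontal:", ", ".join(f"{p:.0f}%" for p in horizontal_pct))
```

For this row every horizontal percentage exceeds every vertical percentage, in line with the 40 to 75% and 15 to 30% ranges stated in the text.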

Area of Lip Opening. Approximate
areas for the lip opening were estab- 
lished by assuming that the horizontal 
and vertical measurements described 
above were the major and minor axes 
of an ellipse. The accuracy of such an 
estimation was checked by means of 
planimeter measurements on a selected 
number of greatly enlarged exposures, 
and it was found that the errors were 
almost always less than ± 10%,
even when there was a marked irregu- 
larity or asymmetry in the shape of the 
opening between the lips. One major 
exception, however, was for the most 
open phases of the production of /o/, 
a lip-rounded vowel, where the calcu- 
lated areas were too low by 10 to 15%. 
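
The ellipse approximation can be written out as a short sketch. One assumption is made loudly here: the tabled vertical separation and the tabled horizontal figure are used directly as the two full axes of the ellipse. That reading is not stated in the text, but it reproduces the initial /p/ entries of Table 1(c) closely.

```python
import math

def ellipse_area(axis_a_mm, axis_b_mm):
    """Area (mm^2) of an ellipse whose full axes, not semi-axes, are given."""
    return math.pi / 4.0 * axis_a_mm * axis_b_mm

# (vertical value, horizontal value, tabled area) for initial /p/:
# /i/ at 5 msec, /i/ at maximum, /o/ at 5 msec (Table 1, rows /p/).
samples = [(4.0, 9.0, 28), (15, 22, 260), (3.0, 12.0, 28)]

for vertical, horizontal, tabled in samples:
    computed = ellipse_area(vertical, horizontal)
    print(f"axes {vertical:4.1f} x {horizontal:4.1f} mm -> "
          f"{computed:6.1f} mm^2 (table: {tabled})")
```

The small differences from the tabled figures are consistent with rounding of the published values.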


Calculated values of the mouth-open- 
ing area as a function of time for the 
word pope are plotted in Figure 4. 
There is a very high rate of growth in 
area during the first 10 msec after the 
consonant explodes. After this rapid 
growth, the function is relatively flat. 
The abrupt change in the speed, caused 
by the vibratory component, is apparent 
in /p/ and /b/, in general, but not in 
/m/. The first part of the curve repre- 
senting the area, however, shows a very 
rapid growth in the case of /m/ as well 
as for stops, because of the particularly 
fast increase of the horizontal dimen- 
sion in the first stage of the lip separa- 
tion as discussed above. 


Supplementary Measurements. Cer- 
tain other data on the production of 
stop consonants besides measurements 
of the lip opening can be derived from
the films and recordings. These supplementary
data include measurements of
the teeth separation and mandible
movements, and determinations of the
onset time of voicing relative to the 
instant at which the mouth closure is 
broken. 


Effect of Dentition. The effective 
area at the open end of the vocal cavity 
is not necessarily defined by the lip 
separation as such. As can be noted in 
Figure 1, the lower teeth of the talker 
were prominently displayed in the 
vowel portions of the words. During 
the process of lip opening, the upper 
teeth were not visible, except in the 
vowel /i/ which is characterized by 
spread lips. 


The constriction contributed by the 
lower teeth was estimated on the front 
views by measuring the vertical dis- 
tance between the inferior edge of the 
upper lip and the superior edge of a 
central incisor on the lower jaw. 
Figure 4 presents an example for the 
utterance pope. 


In that part of the vowel where the 
mouth opening is maximum, the appar- 
ent lip-tooth distance constitutes about 
55% of the lip separation. For /o/ 
and /a/, in general, about one-half of 
the lip opening is obstructed by the 
lower teeth; for /i/ the contribution 
of the teeth to the effective constriction 
is even greater, the lip-tooth distance 
being about 25 to 30% of the lip 
separation. In the latter case (namely 
/i/) the opening during the articulation 
of the vowel is further reduced by the 
intrusion of the upper teeth. The acous- 
tic effect of the teeth, however, is 
usually not as great as this apparently
small opening might suggest. The length


of this constricted portion of the vocal 
tract is very small and, furthermore, 
the upper lip and the lower teeth are not 


in the same vertical plane. For nonclose 
vowels, in general, it is probably quite 
reasonable to consider the labial con- 
striction as the major determinant of 
the mouth opening, and the influence 
of the lower teeth only as a small cor- 
rection term. In the case of /i/, on the 
other hand, the dental constriction more 
reasonably can be regarded as an ex- 
tension of the narrow tube formed by 
the tongue and hard palate, rather than 
as an integral part of the mouth orifice. 

In the early stages of articulatory 
motion, the lip separation is a good
measure of the effective mouth opening,
even when the lower teeth appear
prominently in the front view of the
mouth opening. In comparison to the
very small value observed in the front
view, the actual distance between the
upper lip and the lower teeth is appreciably
larger, if it is measured in the
midsagittal plane. In the case of a
small lip opening, therefore, the effect
of the lower teeth is significant only if
the interdental distance is small compared
to the interlabial distance. This
situation does not occur, however,
since the mandible is already lowered
appreciably before the lips begin to
separate (see infra). 

Mandible Movements. Examination 
of Figure 2 reveals substantial differ- 
ences in profile between the exposure 
just preceding the plosion and that 
about 300 msec (that is, —80 frames) 
before it. The latter exposure probably 
represents the profile of the articulators 
at rest with the upper and lower dental 
arches in contact. If this assumption is
valid, the films indicate that the mandible [...]
lip closure is broken [...]


Table 2. The length of the unvoiced portion (delay of the onset of
glottal vibration referred to the instant of lip separation) of /p/ in
various phonetic environments. Times are in msec.

            /i/    /o/    /a/    /ru/
/p/          12     35     19    105
/əp/         54     52     47     65
/sp/          8     26     22     22

Estimates of this movement of the mandible were also made on the
full-face exposures by referring the lower
teeth to relatively fixed reference 
points on the upper part of the subject’s 
face. (Spectacle frames proved most 
valuable for this purpose!) An example 
of these data is included in Figure 4. 
An attempt to obtain a simpler (but less 
accurate) index of mandible position 
was also made by estimating the level 
of the inferior outline of the jaw from 
full-face exposures. All these measurements
indicate that the mandible moves
smoothly and relatively slowly, starting
well ahead of the stop release and continuing
into the central part of the
vowel; no increase in speed is discernible
at the time of release (explosion).


Onset of Voicing. In Figure 3, small
arrows indicate the approximate times
at which voicing (glottal vibrations)
began in the words. These points in
time were derived from study of spectrograms
of the words recorded during
the photography. The spectrograms
provide a visual amplitude-frequency- 
time display of both the words and 
points corresponding to the time 
markers on the film; the time-marking 
pulse train was high-pass filtered before 
it was mixed with the speech signal. 
For all words containing /p/ the 
voicing onset times relative to the 
plosion are shown in Table 2. 
















These data constitute a clear demonstration
of certain generally accepted
phonological rules of English (13, pp.
xxxi-xxxii). When /p/ is followed by
/r/, for example, the onset of voicing is
late in the syllable, that is, /r/ following
a voiceless stop is devoiced. Similarly,
when /p/ is preceded by /s/, the duration
of the aspirate portion of the stop
is reduced relative to that for /p/ in
syllable initial position. (The anomalous 
/spak/ requires further investigation.) 
The data also demonstrate the markedly 
longer aspiration associated with inter- 
vocalic /p/ relative to initial /p/. 





Discussion 


Articulatory data of the type obtained
in this experiment are particularly
helpful in interpreting the acoustical
events that occur immediately after
the lips separate during the production 
of bilabial stop and nasal consonants. 
The relations between the acoustic out- 
put and the articulatory configurations 
and excitations can be derived from the 
acoustic theory of speech production 
(1, part II, chap. 7, part III, chap. 10, 
4, 7, chaps. 1.1, 1.2, 2.6). This theory 
views the production of speech as the














excitation of a time-varying articula- 
tory system by one or more sources. 
The acoustical behavior of the system 
is usually described in terms of a set of 
resonant frequencies or formants (2, 3, 
14, chap. 8) and consequently it is of 
particular interest to examine the rela- 
tions between the present articulatory 
data and the vocal-tract resonances or 
formants. 


First Stage. In the first 5 to 10 msec 
after the explosion of the initial stops, 
the very rapid change in lip opening is 
largely responsible for the changes in 







the resonant frequencies of the acoustic 
system. It is possible to estimate the 
amount of shift in the resonant fre- 
quencies that should be contributed by 
this change in articulation. 


First Formant. When the mouth 
opening is small, the lowest resonance 
in the vocal tract can be regarded as 
a Helmholtz resonance, that is, the res- 
onance of a lumped-constant simple- 
tuned system. In such a case, the res- 
onant frequency is proportional to the 
reciprocal of the square root of the 
inductance which corresponds to the 
mass of the air at the orifice, if the 
volume of the cavity behind it is con- 
stant. For a bilabial constriction with 
an effective length l and an area A this
inductance is approximately proportional
to l/A. Thus if a fixed length of
the constriction is assumed, the fre- 
quency of the first formant changes 
approximately in proportion to the 
square root of the area of the opening. 
If the mouth opening area changes 
linearly with time, for example, the 
shift in the resonant frequency is rapid 
at first and then is slower. 
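
The square-root relation can be made concrete with the standard Helmholtz resonator formula, f = (c / 2π) · sqrt(A / (l · V)). The sketch below is illustrative only: the speed of sound, the constriction length, and the cavity volume are assumed values, not measurements from this study.

```python
import math

SPEED_OF_SOUND = 35000.0   # cm/sec, assumed value

def helmholtz_frequency(area_cm2, length_cm=1.0, volume_cm3=100.0):
    """Resonant frequency (cps) of a fixed cavity behind a short orifice.

    length_cm and volume_cm3 are hypothetical illustration values.
    """
    return (SPEED_OF_SOUND / (2.0 * math.pi)
            * math.sqrt(area_cm2 / (length_cm * volume_cm3)))

# Mouth-opening area growing linearly with time, in equal steps.
areas_cm2 = [0.1 * n for n in range(1, 6)]
freqs = [helmholtz_frequency(a) for a in areas_cm2]
steps = [later - earlier for earlier, later in zip(freqs, freqs[1:])]

for area, freq in zip(areas_cm2, freqs):
    print(f"A = {area:.1f} cm^2 -> first resonance about {freq:5.0f} cps")
```

Each equal increment in area raises the resonance by less than the previous one, which is the "rapid at first and then slower" behaviour described above; quadrupling the area only doubles the frequency.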


An example of the change in the area 
of the mouth opening is seen in Figure 
4. For the first 5 to 10 msec the change 
in area is particularly rapid. After this 
initial stage the change in area becomes 
much slower, the speed of lip separa- 
tion being comparable to that of the 
movement of the mandible. This reduc- 
tion in the rate of change of the mouth 
opening area causes an even greater re- 
duction in the rate of change of the 
formant frequency because of the non- 
linear relation between area and the 
frequency of the first formant. If it is 
assumed that a rigid wall surrounds the 
vocal tract, complete mouth closure 


should result in a first formant fre- 
quency of zero. After the first 5 msec 
the frequency of the first formant will 
reach a value of approximately 200 to 
400 cps depending on the following 
vowel. Since this time interval corre- 
sponds to one period of a 200-cps wave 
form, the sound output will show no 
recurrence in wave form in the time 
interval immediately after the explo- 
sion. A similar situation can be seen for 
the higher formants also (see infra). 
The first transient portion of 5 or 10 
msec that immediately follows the re- 
lease of the closure, therefore, is better 
described as a short burst, or spike, than 
as a transition. 

Higher Formants. In the case of 
formants higher than the first, the 
simple relation between mouth opening 
area and formant frequency described 
above does not hold. The effect of the 
labial constriction upon the formant 
frequency is now quite dependent on 
the particular articulatory configuration.
In the short time interval immediately
following the explosion, however,
the change in the lip opening is the
predominant characteristic of the articulation,
and constant tongue and mandible
positions can be assumed in estimating
the resonant frequencies which correspond
to the articulatory configurations
during this interval.

It can be shown theoretically that, 
when the peripheral portion of the 
vocal tract is narrowed, all the resonant 
frequencies must be lowered. There 
are some configurations of the vocal 
tract, however, for which narrowing at 
the lips will produce only very small 
reductions in the frequencies of given 
formants. 

If a uniform tube with a small (variable) opening at one end is
assumed, for example, the resonant frequency $\omega_i$ is given by

$$\omega_i \tan(\omega_i/\omega_c) = k\,(A/l), \qquad (i = 1, 2, \ldots) \tag{1}$$

where $k$ is a constant depending on the cross-sectional area of the
tube and $\omega_c$ is a constant depending on the length of the tube.
(The effect of dissipation is neglected.) For small values of
$\omega_1/\omega_c$, the frequency of the first resonance $\omega_1$ is
approximately proportional to $(A/l)^{1/2}$. For higher resonances, it
can be shown that the change of $\omega_i$ is proportional to $A/l$
when the area of the opening is small. If

$$\omega_i = \omega_{i0} + \Delta\omega_i$$

where $\omega_{i0}$ stands for $\omega_i$ when $A = 0$, then the
following relation holds for small opening conditions:

$$\Delta\omega_i \approx \omega_1^2/\omega_{i0}, \qquad (i = 2, 3, \ldots) \tag{2}$$

For the second resonance of a uniform tube with both ends closed,
$\omega_{20}$ is approximately $2\pi \cdot 1000$ cps, assuming
the length of the tube to be that of the 
vocal tract of an average male speaker. 
If a vocal-tract configuration with a 
given mouth opening has a first formant 
at 200 cps, equation (2) indicates that 
the second formant associated with the 
configuration would be only 40 cps 
higher than the second resonance of the 
vocal tract measured with the mouth 
closed. The shift in the third formant is
still smaller. In the case of a tongue 
articulation which is anticipating a front 
vowel, the second or third formant may 
correspond to a resonance mode of the 
front cavity. In such a case, the shift 
of the higher formant is generally more 
apparent since the effective length of 
the vocal tract is much shorter than the 
length of the entire tract. 
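
The 40-cps figure can be checked numerically. The sketch below solves the uniform-tube condition of equation (1), ω tan(ω/ω_c) = k(A/l), by bisection and compares the exact shift of the second resonance with the small-opening estimate, taking relation (2) to be Δω_i ≈ ω_1²/ω_i0 (this reading reproduces the 40 cps quoted in the text). The function name and the bisection approach are illustrative choices, not the author's procedure.

```python
import math

def solve_resonance(K, omega_c, mode):
    """Solve w * tan(w / omega_c) = K for the mode-th root by bisection.

    mode = 1 gives the lowest resonance, mode = 2 the second, and so on.
    """
    lo = (mode - 1) * math.pi * omega_c      # closed-tube resonance: left side is 0
    hi = lo + 0.499 * math.pi * omega_c      # just below the pole of the tangent
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * math.tan(mid / omega_c) - K < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Second closed-tube resonance at 1000 cps, as assumed in the text.
omega_20 = 2.0 * math.pi * 1000.0
omega_c = omega_20 / math.pi

# Choose the opening term K = k(A/l) so that the first formant is 200 cps.
omega_1 = 2.0 * math.pi * 200.0
K = omega_1 * math.tan(omega_1 / omega_c)

omega_2 = solve_resonance(K, omega_c, mode=2)
exact_shift_cps = (omega_2 - omega_20) / (2.0 * math.pi)
approx_shift_cps = (omega_1 ** 2 / omega_20) / (2.0 * math.pi)

print(f"small-opening estimate of the shift: {approx_shift_cps:.1f} cps")
print(f"shift from solving equation (1):     {exact_shift_cps:.1f} cps")
```

The exact solution comes out a few cps above the small-opening estimate, because a first formant of 200 cps is not negligibly small compared with ω_c; both confirm that the second-formant shift is small relative to the first formant itself.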




Instead of calculating the resonant 
frequencies for various configurations, 
it is possible to make estimates based on 
the data supplied by Stevens and House 
(18). They have presented contour 
plots showing first and second formant 
frequencies for various idealized articu- 
latory configurations that are described 
by a small number of parameters. 
In the case of /pit/, for example, the 
mouth-opening area A was about 0.3 
cm², 5 msec after the plosion, Table
1(c). If it is assumed that the length l of
the constriction is about 1 cm (that is,
A/l = 0.3 cm), and that the distance
between the point of constriction and
the glottis is about 12 cm, the Stevens
and House data (compare their Figure 
6) indicate that the first formant is at 
about 230 cps and the second formant 
is at about 1800 cps. With the same 
vocal-tract configuration and complete 
closure of the lips, the value of the 
second formant would fall to about 
1 350 cps, demonstrating a shift of about 
450 cycles in the second formant dur- 
ing the first 5 msec after the plosion. 
In the same fashion the second formant 
for the central portion of the vowel /i/ 
would be estimated at about 2 200 to 
2 300 cps, a reasonable approximation 
to the formant frequency actually ob- 
served in the spectrogram of the utter- 
ance. 


Similar estimates from the Stevens 
and House data can be made for tongue 
configurations appropriate to back 
vowels. If the point of constriction is 
about 5 cm from the glottis and A/l is
smaller than 0.5 cm, the second formant 
is substantially constant in frequency 
somewhere between 800 to 900 cps 
regardless of the mouth opening, and 
the first formant can rise to as high as 







500 cps. The shift of the second form- 
ant in a bilabial consonant + back 
vowel syllable, therefore, may be hardly 
detectable in some cases (see infra). 


Later Stage, Transition. After the 
first stage, which usually occupies a 
time interval of one glottal period or 
less immediately following the separa- 
tion of the lips, there is a stage in the 
production of stops in which the mo- 
tion of the lips is relatively slow and 
comparable to that of other articulators, 
for example, the mandible, or the mass 
of the tongue, presumably. In this 
stage, which can be called a ‘transition’ 
implying a quasi-static change, the 
frequencies of the second and higher 
formants do not necessarily rise as the 
lip opening increases, because the con- 
current change in other parts of the 
vocal tract may be principally respon- 
sible for the shift of the resonances. In 
fact, for initial bilabial stops followed 
by back vowels, it is not unusual to see 
no upward shift of the second and/or 
third formant bars in the vicinity of 





mer i 











FREQUENCY (KCPS) 
O-—- nNW tO Oo 





AN 
0 0.2 0.4 06 
TIME (SEC) 


Ficure 5. Sonagram of an utterance boil, by 
a male speaker of General American English. 
The second formant shifts downward im- 
mediately after the release of the stop, and 
then moves upward. 





the spike (8, 11, 14, chap. 8). The 
abrupt upward shift of the resonance 
frequencies in the first short time in- 
terval is not generally observable in a 
spectrogram, and the starting points 
of the formant bars do not represent 
the resonant frequencies of the vocal 
system with complete articulatory 
closure. A downward shift may even 
be observed, depending on the phonetic 
environment and also on the particular 
language (8). An example of such a 
situation is illustrated in Figure 5, where 
a sound spectrogram of the word boil 
uttered by a speaker of General Amer- 
ican English is shown. 

In the transition to the central part 
of the following vowel, the first form- 
ant, on the other hand, will generally 
show an upward shift since the lowest 
resonance is not very sensitive to a 
change in the location of the tongue 
constriction. (This rising first formant 
is often obscured by the lack of low- 
frequency resolution of the conven- 
tional spectrograph.) In view of the 
appreciable difference in the speed of 
lip separation observed in the very 
early stage, however, initial tense and 
lax stops may indicate apparently dif- 
ferent starting points of the first form- 
ant bar. This may partially explain 
the experimental results reported pre- 
viously concerning the effect of the 
first formant transition on identification 
of initial tense and lax stops (2, 5), 
although presumably there are many 
other factors which are involved in 
this problem (9). 

In general the articulation of stop 
consonants shows a third stage where 
the lip separation is again relatively 
rapid, although not as rapid as in the 
initial stage. The rate of the formant 
shift will be markedly lower in this 










period than that in the first portion 
of the articulation because of the non- 
linear relation between the opening 
area and the formant frequencies, and 
the effect of the lower teeth. 


Intervocalic Stops. In the case of
intervocalic stops, the particularly rapid
change in the first 5 to 10 msec is
generally found but the absolute speed
is not as great as that for initial /p/.
The distinction of the first transient 
portion from the subsequent transition 
is still valid, although the initial shift 
of the formant frequencies is not as 
great. The formant bars in spectro- 
grams consequently may show a clearer 
approach toward the ‘target’ points. 
The initial rate of lip separation for
initial /b/ is generally somewhere be- 
tween that for initial /p/ and that for 
intervocalic /p/ or /b/. 


Bilabial Nasal. In the case of /m/,
the initial lip movement is usually not
as fast as that for initial stops (see Table
1). The speed in the second stage, however,
is considerably greater than that
for stops, because the vibratory component
is not present (see Figure 3).
The behavior of the formants, therefore,
may indicate the presence of a
target or locus, although the formant 
bars will be shifting rapidly and will 
be smeared. When /m/ is followed by 
a front vowel, however, the higher 
formants may exhibit a very rapid and 
unidentifiable shift near the plosion 
(12) because the resonant frequencies 
are more sensitive to the change in the 
lip opening for this vocal-tract con- 
figuration. 

For initial and intervocalic /m/, the
average rate of change of area during
the first 20 msec, including the first
transient portion, is typically equal to
or slightly greater than that for initial
/p/. In both /mit/ and /amit/, the
amount of opening at the end of this
interval was 160 mm². Assuming 10
mm for l, an estimation from the
Stevens-House contours gives a shift
of the second formant from 1 400 cps
to 2 100 cps within this 20-msec interval.
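
The figures quoted above can be turned into average rates with a few lines of arithmetic. The following is only a sketch: it assumes that the lip opening starts from zero at the plosion and that the change is spread evenly over the 20-msec interval, which the text itself says is not the case in the first transient portion. The variable names are illustrative.

```python
# Figures quoted in the text for /mit/ and /amit/.
interval_msec = 20.0     # averaging interval following the plosion
area_mm2 = 160.0         # lip opening at the end of the interval
f2_start_cps = 1400.0    # second formant at the start (Stevens-House estimate)
f2_end_cps = 2100.0      # second formant at the end of the interval

# Assumption: opening area is zero at the instant of plosion.
area_rate = area_mm2 / interval_msec                    # mm^2 per msec
f2_rate = (f2_end_cps - f2_start_cps) / interval_msec   # cps per msec

print(f"mean rate of area change: {area_rate:.0f} mm^2/msec")
print(f"mean rate of F2 shift:    {f2_rate:.0f} cps/msec")
```

These are averages over the whole interval; the instantaneous rates in the first 5 to 10 msec would be higher.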


Spike, Frication and Aspiration. It 
has been shown that in the first transient 
portion of the acoustic event the change 
in resonant frequencies is so rapid that 
this phase cannot be called a ‘transi- 
tion.’ It is worth pointing out, however, 
that this time interval may carry 
sufficient cues for identification of stop 
consonants, even when no noise excita- 
tion is present. Synthesis experiments 
using both terminal and topographic 
electrical analogs of the vocal tract 
have shown that a shift of the resonant 
frequencies that occurs entirely within 
the first 10 to 20 msec can result in 
identifiable voiced stop-vowel syllables 
(10). 

The fact that the production of an 
initial bilabial stop consonant (when 
followed by a vowel) results in an 
appreciable mouth opening very shortly 
after the separation of the lips suggests 
that the noise characteristic of such an 
articulation is not generated at the 
lip constriction, except at the very 
beginning. It is reasonable, therefore, 
to describe the acoustic structure of a 
/p/ + vowel syllable as consisting of 
three successive phases—a spike, an 
aspiration, and a voiced vocalic phase. 
The transition that serves as a cue for 
the consonant identification may be 
contained in the second and/or the 
third phase. The spike is characterized 
by a rapid change in the articulatory 







system, that is, in the transmission 
characteristics, with the participation 
of a sound source located at the con- 
striction. 


Concluding Remarks 


More data are required before firm 
quantitative conclusions can be drawn, 
since some fluctuation must be expected 
from speaker to speaker and from utter- 
ance to utterance with a given speaker. 
The preliminary data reported here, 
however, represent reasonably typical 
examples of natural utterances. When 
combined with theoretical considera- 
tions, the data also have provided rea- 
sonable physical accounts of the 
essential characteristics of the acoustic 
output. A series of experiments in- 
volving more speakers and a greater 
number of consonants and replications 
is in progress and will be reported at 
a later date. 


Summary 


The movements of the lips when 
bilabial stops and nasals are produced 
in various phonetic environments have 
been photographed for study by means 
of a stroboscopic technique. It has 
been found that immediately following 
the plosion the rate of increase in the 
lip opening is relatively high. A sig- 
nificant portion of the change of the 
resonant frequencies of the vocal tract, 
in consequence, takes place in 5 or 10 
msec, and in this brief interval the 
acoustic output cannot be characterized
by transitions of formants that are de- 
scribed in a quasi-static manner. The 
effect of the environment of the con- 
sonant upon the initial speed of the lip 
opening process is considerable; the 
movement is particularly rapid when 


a tense bilabial stop consonant is in 
word-initial position. There is a sig- 
nificant difference in the physical 
mechanism of the motion of the lips 
during the production of the nasal 
bilabial, compared to that for the stops, 
because of the overpressure built up 
behind the closure in the case of stops. 
The relation of the articulatory data 
to the acoustic structure of the output 
speech signals has been discussed in 
some detail. 


Acknowledgment 


The generous cooperation of Pro- 
fessor H. E. Edgerton in providing 
stroboscopic facilities is gratefully 
acknowledged, as well as the technical 
assistance of Mr. P. Yamin of the 
Stroboscopic Measurement Laboratory. 
The author also wishes to express his 
thanks to Dr. Arthur S. House for his 
valuable advice and cooperation as the 
subject, and to Professor Morris Halle 
and Professor Kenneth N. Stevens for 
their stimulating discussions and encour- 
agement. Thanks are due also to Dr. 
George Rosen for his suggestions on 
the instrumental techniques, and to 
Miss Jane Arnold for her effective work 
in the data processing. 


References 


1. Chiba, T., and Kajiyama, M., The
Vowel, Its Nature and Structure. Tokyo:
Tokyo-Kaiseikan, 1941.

2. Cooper, F. S., Delattre, P. C., Liberman,
A. M., Borst, J. M., and Gerstman, L. J.,
Some experiments on the perception of
synthetic speech sounds. J. acoust. Soc.
Amer., 24, 1952, 597-606.

3. Delattre, P. C., Liberman, A. M., and
Cooper, F. S., Acoustic loci and transitional
cues for consonants. J. acoust.
Soc. Amer., 27, 1955, 769-773.










4. Dunn, H. K., The calculation of vowel
resonances, and an electrical vocal tract.
J. acoust. Soc. Amer., 22, 1950, 740-753.

5. Durand, Marguerite, De la perception
des consonnes occlusives questions de
sonorité. Word, 12, 1956, 15-34.

6. Edgerton, H. E., Electronic high-speed
motion pictures. Electron. Equip. Engng,
6, 1958, 43-48.

7. Fant, G., Acoustic Theory of Speech
Production. The Hague: Mouton, 1960.

8. Fischer-Jørgensen, Eli, Acoustic analysis
of stop consonants. Misc. Phonet., II,
1954, 42-59.

9. Fischer-Jørgensen, Eli, What can the
new techniques of acoustic phonetics contribute
to linguistics? In Proc. VIII int.
Cong. Linguists. Oslo: Oslo Univ. Press,
1958.

10. Fujimura, O., Some synthesis experiments
on stop consonants in the initial
position. Quart. Prog. Rep. 61, April 15,
1961, Res. Lab. Electronics, Mass. Inst.
Technology.

11. Halle, M., Hughes, G. W., and Radley,
J. P. A., Acoustic properties of stop
consonants. J. acoust. Soc. Amer., 29,
1957, 107-116.


12. Hattori, S., Yamamoto, K., and Fujimura,
O., Nasalization of vowels in
relation to nasals. J. acoust. Soc. Amer.,
30, 1958, 267-274.

13. Kenyon, J. S., A guide to pronunciation.
In Webster’s New International Dictionary
of the English Language. (2nd
ed.) Springfield: Merriam, 1952.

14. Potter, R. K., Kopp, G. A., and Green,
Harriet C., Visible Speech. New York:
Van Nostrand, 1947.

15. Rosen, G., Dynamic analog speech synthesizer.
Tech. Rep. 353, 1960, Res. Lab.
Electronics, Mass. Inst. Technology.

16. Smith, G. A., A motion picture study
comparing lip and jaw movement and
area of mouth opening of nasal and non-nasal
speakers. M.A. thesis, Univ. Iowa,
1950.

17. Stetson, R. H., Motor Phonetics: A Study
of Speech Movements in Action. (2nd
ed.) Amsterdam: North-Holland, 1951,
for Oberlin Coll.

18. Stevens, K. N., and House, A. S., Studies
of formant transitions using a vocal tract
analog. J. acoust. Soc. Amer., 28, 1956,
578-585.


RESEARCH NEWS NOTE


Deafness and Every-Day Living


President Leonard M. Elstad announces
that Gallaudet College this year will begin
a new research project to study how deafness
affects the pattern of every-day living in the
metropolitan area of Washington, D. C., and,
in turn, to determine how this pattern is
related to vocational and social adjustment.
The project is supported by a grant to the
college by the United States Office of Vocational
Rehabilitation.

The specific purpose of the study is to
report on the number and vital statistics of
deaf persons in the area: family composition
and structure; occupations; population movement;
relations between the deaf and the
hearing; social participation and interest in
community affairs; and needs for counseling
and other vocational rehabilitation services.


Jerome D. Schein, Ph.D.
Project Director and Head
Office of Psycho-Educational Research
Gallaudet College
Washington, D.C.








Intelligibility of Slow-Played Speech 


WILLIAM R. TIFFANY 


DELMOND N. BENNETT 


It has been frequently demonstrated 
that improvement of the intelligibility 
of speech for listeners with extreme 
losses or distortions in the high fre- 
quencies cannot be accomplished solely 
by ‘boosting the highs.’ Indeed, alto- 
gether too often intelligibility cannot 
be improved by any simple means of 
amplification. It is well known also that 
in perceptive hearing losses the receptor 
mechanism governing the critical high 
frequency areas may be damaged be- 
yond recovery or repair. 

The clear implication of these facts 
is that improvement of intelligibility in 
the case of marked end-organ or nerve 
damage must be accomplished through 
the more effective use of remaining 
receptor units and that any attempt to 
force damaged or missing units to 
respond through amplification will fail. 
It appears difficult, if not impossible, 
to widen a narrow pass band in an 
abnormal ear merely through selective 
amplification. 

William R. Tiffany (Ph.D., University of
Iowa, 1951) is Associate Professor of Speech,
University of Washington. Delmond N. Bennett
(M.A., University of Washington, 1958),
Predoctoral Associate at University of Washington
at the time this study was made, is
Director, Speech and Hearing Clinic, North
Dakota State University.


If the pass band of a damaged ear
cannot be widened to accommodate
the fairly extensive band width occupied
by the important information in
the speech signal, another approach 
may be possible, an approach which has 
received little attention in the hearing 
literature to date. This approach recog- 
nizes the fact that in most instances of 
end-organ damage a relatively normal 
sensitivity to low frequencies may be re- 
tained. If a limited, low frequency band 
of normal reception is retained it may 
be possible to compress or otherwise 
alter the information contained in the 
speech signal so that the limited ear 
may encompass it. 


One of the most significant of the 
factors which are important to speech 
recognition is the pattern of energy 
distribution of the vowel sounds de- 
scribed by the first two formants. It is 
known also, however, that the formant 
frequencies are not as critical as are 
the formant frequency relationships or 
patterns. These patterns usually occupy 
the area between about 300 and 3 000 
cycles, but there is reason to believe 
that the frequency area occupied by 
the formants is of secondary considera- 
tion. Rather than the formant frequen- 
cies per se, the formant ratio is 
important in determining the vowel, 
and the frequencies may vary greatly 
as long as the ratio remains constant. 
This fact would seem to indicate the 
way in which a compression of speech 
may be accomplished in order that it 




may be fitted to the limited capacities 
of a damaged ear. 

If it is the relative, rather than the 
absolute, formant positions of the vowel 
sounds which determine their intelligi- 
bility then it should be possible to shift 
the overtones of the speech signal 
downward on the frequency scale by 
proportional amounts, retaining relative 
formant position and intelligibility
while bringing the absolute frequencies 
of the speech signal within the capabili- 
ties of damaged ears. 
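
The proportional shift described above can be illustrated numerically. The sketch below multiplies each formant by the same playback rate and shows that the ratio between the formants is unchanged while the absolute frequencies fall. The formant values and the function name `slow_play` are hypothetical and chosen only for illustration; they are not measurements from this study.

```python
def slow_play(frequencies_cps, rate):
    """Scale every frequency by the playback rate (1.0 = normal speed)."""
    return [f * rate for f in frequencies_cps]

# Hypothetical first and second formants, in cps, for illustration only.
formants = [300.0, 2400.0]
rate = 0.5  # half-speed playback

shifted = slow_play(formants, rate)
ratio_before = formants[1] / formants[0]
ratio_after = shifted[1] / shifted[0]

print("shifted formants (cps):", shifted)
print("F2/F1 before:", ratio_before, "after:", ratio_after)
```

Whether a preserved ratio is sufficient for recognition is the empirical question taken up in the experiments that follow.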

Except for band elimination experi- 
ments, investigations of the distortions 
imposed upon speech by various meth- 
ods of frequency change have been few. 
Several, however, have been concerned 
with some kind of frequency change, 
or ‘shift,’ which may be cited to answer 
questions concerning the effect on in- 
telligibility of proportionate downward 
shifts of the speech frequency spectrum. 

A ‘frequency shift’ experiment has 
been reported by Fletcher (3), who 
investigated the intelligibility of ‘stand- 
ard articulation lists’ at speeds of 
rotation of a phonograph turntable de- 
creasing to as much as one-half of 
normal. The articulation scores re- 
ported indicate that a speed of about 
.6 of normal produced a score of about 
41%. 

Investigations, following Fletcher’s, 
which have studied the effect on intel- 
ligibility of frequency shifts include 
those of Engelhardt and Gehrcke (1), 
Ochiai and his colleagues (5, 6, 7, 8, 9), 
and Kurtzrock (4). Among these 
studies probably the most thorough 
explorations have been those of Ochiai 
and others at Nagoya University. 
Ochiai has used what he termed ‘rota- 
tional synchronous distortion’ to in- 
vestigate the effects of frequency shift 


upon the articulation of Japanese speech
samples. Slow-play ratios of .33, .50, 
and .75 were utilized with three talkers 
(adult male, adult female, and child) 
and two ‘modes’ (whispered and 
phonated) for 100 CV syllables. The 
results indicate an articulation for half- 
speed phonated adult male speech of 
about 70%. Considerably better articu- 
lation was found for the speech of 
adult females and children than for 
that of adult males, considerably poorer 
for whispered than for phonated speech, 
and considerably better for vowels than 
for consonants or syllables. 


In a recent study, Kurtzrock (4)
investigated several aspects of time and 
frequency distortion. Among other 
things he experimented with slowed 
speech (which he terms a ‘time-fre- 
quency’ distortion condition) as well 
as frequency expansion using the time 
compression-expansion device de- 
veloped by Fairbanks, Everitt, and Jaeg- 
er (2). Kurtzrock’s intelligibility scores, 
obtained for one male speaker, highly 
trained listeners, CVC words, and for 
slow-play factors of 2.00, 2.83, and 4.00, 
were as follows: 43%, 22.7%, and 
14.8%, respectively. For straight fre- 
quency division by factors of .50, .35, 
and .25, scores were 32.8%, 16.5%, and 
13.8%, respectively. Among the con- 
clusions based upon these and other 
data were: (a) frequency change is of 
more importance than time change alone 
in determining the intelligibility of 
slowed speech, and (b) in general the 
vowels are more affected than the 
consonants by frequency distortion. 
Kurtzrock states also (p. 26) that ‘these 
data emphasize the view that the abso- 
lute locations of component frequencies 
are important, inasmuch as the relative 










locations were preserved in the experi- 
ment.’ 
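
The two sets of factors quoted from Kurtzrock describe the same frequency ratios from opposite directions: a slow-play factor k stretches time by k and multiplies every frequency by 1/k. The short sketch below, which assumes nothing beyond that reciprocal relation, recovers the frequency-division factors from the slow-play factors and expresses each shift in octaves.

```python
import math

# Slow-play (time expansion) factors quoted in the text.
slow_play_factors = [2.00, 2.83, 4.00]

# Each frequency is multiplied by the reciprocal of the slow-play factor.
frequency_ratios = [1.0 / k for k in slow_play_factors]

# The same shift expressed as octaves downward.
octaves_down = [math.log2(k) for k in slow_play_factors]

for k, r, o in zip(slow_play_factors, frequency_ratios, octaves_down):
    print(f"slow-play x{k:.2f} -> frequency ratio {r:.2f}, {o:.1f} octaves down")
```

The factors therefore correspond to shifts of about one, one and a half, and two octaves.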


From the results of these experiments 
it would appear that any attempt to 
accommodate speech to an injured ear 
by compressing its frequency range 
through frequency division or ‘slow- 
play’ might well result in grossly dis- 
torted speech and low intelligibility. 
However, the data adduced thus far 
leave several points unclear. In the first 
place, there is the discrepancy between 
the figures of Ochiai on the one hand 
and those of Fletcher and Kurtzrock 
on the other. Presumably this discrepan- 
cy results from the different linguistic 
systems studied, perhaps from the fact 
that the Japanese vowel system contains 
fewer choices than does the English. 
In the second place, there is the obvi- 
ously extremely important source of 
speaker variation revealed by Ochiai in 
his male-female comparisons and not
investigated by Fletcher or Kurtzrock. 


It may be particularly noteworthy 
that even though absolute frequency 
locations are ‘important’ this does not 
necessarily imply an importance which 
is permanent or which is imposed by 
the structure of the ear or the nature 
of sound. It seems more likely that what 
is intelligible is based upon normal 
experience of the ranges within which 
speech frequencies usually vary. People 
are accustomed to listening to speech 
ranging from that produced by small 
children to that of adult males. Thus 
they have no difficulty recognizing 
formants which vary within the limits 
imposed by a fairly wide range in the 
size of normal human resonators. Be- 
yond these frequency limitations, listen- 
ers lack the listening experiences 
necessary for adequate and correct 


perception of vowels. What is un- 
known, and what is of prime impor- 
tance here, is the degree to which a 
listener can learn to identify vowel 
resonances which lie outside the normal 
range of experience, particularly those 
which are lower than normal. The cru- 
cial question concerning the intelligi- 
bility of slowed speech is not how 
intelligible it is, but how intelligible it 
can be. 

The experiments reported in this 
paper have been designed to yield in- 
formation on two of the points dis- 
cussed above. In Experiment I the 
question is concerned with the relation- 
ship between intelligibility and amount 
of frequency shift in four different 
speakers (two male, two female) speak- 
ing American vowels. In Experiment II 
the question is concerned with the 
extent to which the decrease in intel- 
ligibility caused by extreme changes in 
frequency can be overcome through 
learning. 

In these experiments, the first in a 
series, only normal listeners were used. 
Experiment III, now under way, will 
apply the information gained to the 
hard-of-hearing case. The ultimate and 
distant hope is that, by a relatively un- 
complicated means, some light can be 
shed on the practicability of submit- 
ting speech to certain controlled fre- 
quency shift distortions which, though 
they may result in loss of intelligibility 
for normal listeners, may through train- 
ing result in improved reception in the 
case of the severely hard of hearing. 


Experiment I 


The first experiment is concerned 
with the influence upon intelligibility of 
proportional frequency lowering which 
is accomplished by the simple expedient 













of making a tape recording of speech 
and playing back that tape recording 
at tape speeds slower than those used 
in the original recording. 

The slow-play method of frequency 
lowering has at least two advantages: 
(a) it is simple and available to nearly 
everyone, and (b) it could be applied 
as an actual teaching or corrective de- 
vice should this eventually prove desir- 
able. Aside from the fact that the 
slow-play method of frequency shift 
results in an inseparable combination of 
both time and frequency distortion, the 
main objection to this method of fre- 
quency change is that it is not instan- 
taneous and requires recording and 
subsequent playback. The first of these 
objections, which has to do with the 
time distortion, may not be as important 
as it seems. At least Kurtzrock (4) 
found little loss of intelligibility result- 
ing from time expansion alone. The 
second drawback, the necessity for re- 
cording and playback, will have to be 
surmounted before frequency distor- 
tion can be practical in conversational 
discourse but will not interfere with the 
use of the slow-play method as an ex- 
perimental technique. 

The question of what constitutes an 
appropriate intelligibility test has not 
been answered to everyone’s satisfac- 
tion. It may involve the recognition of 


large samples of speech, randomly


chosen or weighted phonetically ac- 
cording to some controlled sound-fre- 
quency count. It also may involve 
choosing one or more of the many 


possible kinds of phonetic elements and


being content with limiting conclusions 
to those particular elements. In the 
present study only vowel intelligibility 
has been investigated. Vowels were 
selected for study partly for experi- 




mental simplicity and partly because 
recognition of the vowel is important 
to speech recognition in general. Again, 
there is reason to suspect, on the basis 
of Kurtzrock’s findings, that slow-play 
will have its influence primarily upon 
the vowel. 


Method. The method of Experiment 
I involved the presentation to normal 
listeners of word lists which were made 
up of repetitions of 10 different vowels 
in a constant [hd] context. These 
lists were recorded by four speakers 
and played back at the same and at 
successively lower playback speeds to 
groups of untrained listeners. The re- 
sponses of these listeners were then 
scored to obtain intelligibility scores 
for up to seven different degrees of 
speed-frequency alteration (to as low 
as 45% of normal) for two male and 
two female speakers. 


Vowel Formants. An important vari- 
able is that of the resonances, or form- 
ant frequencies, of the speakers’ vowels. 
For example, it would seem reasonable, 
in view of Ochiai’s results, to expect 
that the higher formants possessed by 
women would make their speech intel- 
ligibility more resistant to the distortion 
of downward frequency shift than 
would be the case for the speech intel- 
ligibility of men. Thus, both male and 
female speakers were used in this study. 
Two male speakers were deliberately 
chosen as having ‘resonant,’ low pitched 
voices; speaker M-1 was a speech 
teacher, speaker M-2 was a highly 
skilled baritone singer. On the other 
hand the female speakers (F-1, F-2) 
were chosen for their very ‘feminine 
sounding’ voices. All four were trained 
speakers and all spoke well articulated, 
General American speech.







Table 1. Fundamental and formant frequency measures (in cycles per second) for four sample vowels for two male
(M-1, M-2) and two female (F-1, F-2) speakers.

Measures                     M-1      M-2      F-1      F-2

Fundamental Frequency*       116      110      210      205
[i]   Formant 1              275      250      500      500
      Formant 2            2 700    2 250    3 000    3 000
[æ]   Formant 1              700      650    1 000      950
      Formant 2            1 950    1 700    2 000    2 000
[a]   Formant 1              650      700      900      900
      Formant 2            1 100    1 100    1 400    1 400
[u]   Formant 1              300      300      400      500
      Formant 2              750      650    1 200    1 150

*Averaged for all four vowels.


Sound spectrographic measurements 
of the fundamental frequencies and the 
first two formant frequencies for four 
sample vowels are presented in Table 
1. It can be seen that the male measures 
indicate frequencies well within the 
normal range or lower, with speaker 
M-2 showing, for the most part, lower 
resonances than M-1. The female voices 
are seen to be highly similar and de- 
cidedly higher in both formant and 
fundamental frequencies when com- 
pared with the male. 

Test Word Lists. Each of the 
speakers recorded a 50-word test list 
which consisted of five repetitions of 
each of 10 different vowels in an 
[hd] context: [hid, hɪd, hɛd, hæd,
hɑd, hɝd, hʌd, hod, hʊd, hud]. In
addition a 20-word practice list consist- 
ing of two repetitions of each of the 
different syllables was recorded. All 
orders were randomized and a carrier 
phrase preceded each word. 

Master recordings were made of the 
four sets of experimental and practice 
readings; eight different working tapes 
were dubbed from them at eight dif- 
ferent tape speeds. Employment of 


successively smaller capstan drives 
with the playback recorder gave speeds 
approximately as follows: 15 ips 
(normal), 13.1 ips, 11.8 ips, 10.3 ips, 
9.1 ips, 7.5 ips, and 6.5 ips. For 
speakers M-1 and F-1 an eighth con-
dition of playback was dubbed at a 
speed of about 5.9 ips. Thus, the re- 
ductions in frequency of successive 
dubbings approximate 10% for each 
successive tape except for the last two 
which approximate 5%. 
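
The relation between the tape speeds listed above and the tape rates used in the results can be checked directly. The sketch below divides each approximate playback speed by the 15-ips recording speed and lists the step between successive rates, in percentage points of normal. The variable names are illustrative.

```python
normal_ips = 15.0
# Approximate playback speeds quoted in the text, including the eighth
# condition used only for speakers M-1 and F-1.
speeds_ips = [15.0, 13.1, 11.8, 10.3, 9.1, 7.5, 6.5, 5.9]

# Playback rate as a whole-number percentage of normal speed.
percent_of_normal = [round(100 * s / normal_ips) for s in speeds_ips]

# Drop between successive rates, in percentage points of normal.
steps = [a - b for a, b in zip(percent_of_normal, percent_of_normal[1:])]

print("per cent of normal:", percent_of_normal)
print("successive steps:  ", steps)
```

Because every frequency in the recording scales with tape speed, these percentages are also the frequency ratios of the slowed speech.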

Listeners. There were 15 groups of 
four listeners each. All were normal 
hearing, college undergraduate speech 
students naive in terms of phonetics and 
the experiment. They were told that 
they would listen to word lists read by 
a male and a female reader, and that on 
hearing a word they were to respond 
by writing one of the following words: 
heed, hid, head, had, hud, heard, hoed, 
hood, or who'd. Eight different groups 
heard speakers M-1 and F-1, in that 
order (male before female), at the 
eight different speeds, one speed per 
group. A separate series of seven groups 
of four listeners each heard speakers 










Table 2. Intelligibility (per cent correct) scores of four listeners for the eight different slow-
down rates of test tapes recorded by four speakers.

Tape Rates          Speakers                 Speakers
(% Normal)       M-1     M-2     Mean     F-1     F-2     Mean

100              100     100     100       99     100     100
87                96     100      98       95     100      98
79                99      96      98       94      96      95
69                96      94      95       97     100      98
61                78      73      76       90      96      93
50                32      44      38       80      70      75
43                14      23      18       63      56      59
39                14                       42








F-2 and M-2 in that order (female be- 
fore male), at seven different speeds. 

Equipment. Taped word lists were 
recorded in a quiet room on a Presto 
RC 10/24 recorder at 15 ips. Experi- 
mental tapes were made by dubbing 
from a Presto RC-7, for which special 
capstans of various sizes had been made, 
to an Ampex 600. The intelligibility 
measures were made by groups of four 
listeners simultaneously listening to 
playbacks over the Presto RC 10/24, 
through earphones, in a sound isolated 
listening room, at approximately 70 db 
above threshold. 

Results. The results of the first ex- 
periment are presented in Table 2. This 
table presents the average intelligibility 
scores based upon the responses of four 
listeners to five repetitions each of 10 
different vowels as spoken by two male 
and two female speakers. Because of the 
fact that sex and sequence effects are 
confounded, and because few data are 
available for the evaluation of listener 
variation, no statistical analysis is pre- 
sented. The differences in intelligibility 
obtaining between speakers of the same 
sex (Table 2) are not great enough in 
comparison to the listener variation to 
allow for any well defined conclusions. 


The obtained trends in intelligibility 
scores as a function of frequency 
change, however, are obvious. These 
results appear to be in very good agree- 
ment with the findings of other re- 
searchers employing English speech 
samples. For example, the recognition 
score (at one-half speed) of 38% com- 
pares with the 32.8% found by Kurtz- 
rock for a frequency shift of the same 
amount. The fact that the articulation- 
versus-frequency curve appears to reach 
an asymptote at about 43% of normal 
frequency is likely a function of the 
number of choices available, as well as 
of the tendency toward high artic- 
ulation scores for a few of the vowels 
over all of the conditions studied. 

The sex differences are so great and 
consistent as to leave little doubt as to 
their interpretation. The results are in 
accord with the findings of Ochiai on 
this point and show a tendency for 
intelligibility scores of the female 
speech to be much less affected by 
slow-play distortion. It would seem that 
it is not frequency change per se which 
is a critical factor, but the degree to 
which the formant positions remain 
within the realm of the listener’s com- 
mon experience of human speech. 







Hence, compared to male speech, fe- 
male speech can apparently sustain a 
considerably greater frequency lower- 
ing before it moves outside the range of 
common experience of vowel reso- 
nances for the simple reason that it 
was higher to start with. 
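
This argument can be made concrete with the first-formant values of Table 1. The sketch below asks, for each sample vowel, how far playback can be slowed before a speaker's first formant falls below the lowest unshifted value among the four speakers. That floor is only a crude stand-in, assumed here for illustration, for the "range of common experience"; the true range is not measured in this study, so only the male-female comparison is meaningful. The dictionary keys "ae" and "a" are plain-text labels for the second and third vowels of Table 1.

```python
# First-formant values (cps) from Table 1, keyed by vowel then speaker.
f1 = {
    "i":  {"M-1": 275, "M-2": 250, "F-1": 500, "F-2": 500},
    "ae": {"M-1": 700, "M-2": 650, "F-1": 1000, "F-2": 950},
    "a":  {"M-1": 650, "M-2": 700, "F-1": 900, "F-2": 900},
    "u":  {"M-1": 300, "M-2": 300, "F-1": 400, "F-2": 500},
}

def lowest_rate_in_range(vowel, speaker):
    """Smallest playback rate keeping F1 at or above the assumed floor."""
    floor = min(f1[vowel].values())  # assumption: lowest unshifted F1 in Table 1
    return floor / f1[vowel][speaker]

for speaker in ["M-1", "F-1"]:
    rates = {v: round(lowest_rate_in_range(v, speaker), 2) for v in f1}
    print(speaker, rates)
```

Under this assumption the female speaker tolerates a lower playback rate than the male speaker for every sample vowel, which is the direction of the difference seen in Table 2.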

Some consideration was given to the 
types of errors made as a function of 
the particular vowel heard. Errors were 
scattered, and wrong choices were pre- 
dictable only to a limited extent on the 
basis of known shifts of vowel formant 
positions. Errors on female and male 
speech were highly similar in type. For 
example, the four sounds accounting 
for the greatest number of errors were 
the same for both men and women. 
They were the [e, x, a, x]. At 61% 
of normal frequency these sounds ac- 
counted for 85% of the errors made on 
male speech. However, the greater the 
number of errors, the more scattered 
the error types tended to become. 


Experiment II 


It was suggested above that frequency 
shift of the slow-play type might be 
utilized to improve the perception of 
important speech information by a 
damaged ear. Although it is obvious 
from the foregoing results that the in- 
telligibility of male speech is markedly 
disturbed by slow playing, even for 
normal ears, this does not mean that 
frequency shift could not be employed 
with the hard of hearing. Possibly a 
listener could adapt to such a source of 
distortion. Such adaptation would, of 
course, be necessary if frequency com- 
mutation, beyond very modest amounts, 
is to be of any practical use as a ‘cor- 
rective distortion.’ Experiment II, then, 
is designed to discover whether with 
unsophisticated normal-hearing listeners 


a significant amount of listener adapta- 
tion to the distortion can take place in 
a short series of trials. 


Method. This experiment involved 
the administration of a series of intel- 
ligibility tests, similar to the tests de- 
scribed in Experiment I, to 10 control 
listeners and to 10 experimental lis- 
teners. The purpose was to reveal the 
effects of learning, that is, listener ad- 
aptation to downward frequency shift 
distortion. 


Speakers and Tests. The single 
speaker in this experiment was Speaker 
M-1 of Experiment I. The vowels were 
presented in the same [hd] context, 
at the same frequency, and at about the 
same intensity as in that experiment. 
There were, however, several differences
in the test. The diphthongs [ɔɪ,
aʊ, aɪ, ju, eɪ] were added, the carrier
phrase was not used, and only the single 
half-speed slow-down rate was investi- 
gated. Instead of a random presentation, 
four different sequences of presentation 
were contrived in an attempt to dis- 
cover whether recognition or learning 
would be facilitated by certain orders 
of presentation. 


In Sequence One, always first, the 
progression was [i, ɪ, e, ɛ, æ, ɑ, ʌ, ɝ, o,
ʊ, u, ɔɪ, aɪ, aʊ, ju]. This represented
an approximation to an ‘adjacent order’ 
with respect to vowel diagram location. 
With one or two exceptions the adja- 
cent vowels are those which might be 
expected to be easily confused. Se- 
quence Two also was devised to pre- 
serve some (lesser) degree of lawfulness 
in the groups of front and back, high 
and low, and diphthongal vowels. The 
progression was [u, 3, U, 0, A, a, 2, 
€, €, I, 1, at, 1, ju, av]. Sequences Three 
and Four were essentially random, ex- 










cept that an effort was made to have 
adjacent vowels come from opposite 
sides of the vowel diagram. Sequence 
Three was [i, u, 3, au, A, I, ju, 4, e, 
U, aI, 0, w, o1, €] and Sequence Four 
was [€, 0, €, ju, G, i, 3, 31, U, I, aU, x, 
A, at, U]. 

Listeners. The 60-item test tapes were 
played to an experimental and a con- 
trol group of undergraduate speech stu- 
dents, 10 students in each group, over 
high quality playback equipment (Ampex
600 with Ampex 612 speaker-amplifier
complement) at a listening
level of about 70 db above spondee 
threshold. Listening was done in a large 
classroom. Each group of listeners 
heard the same complete 60-item list 
four times, once each day for four 
successive meetings of a Tuesday- 
Thursday class, with instructions to lis- 
ten and to write one of the following 
words for each item heard: hod, hide, 
how’d, heed, Hugh’d, had, heard, hoed, 
head, hayed, hid, who'd, hood, Hud, 
hoyd. 




Before listening to the test tape both 
groups of listeners were given a written 
script and told to follow it carefully 
while they listened to a tape recording 
of what was on that script inasmuch as 
the word lists to follow would other- 
wise be hard to understand. The script 
and recording contained sentences in 
which each of the words of the test 
was employed twice, and in which the 
vowel or diphthong under test was employed
three times. For example: ‘The
word heed, as in “please try to heed
these directions.” The word hid, as in
“Tom hid out of sight by lying in the
ditch.” ’ The three critical words were
all spoken with considerable emphasis. 

The control group listened to the 
practice sentences played at normal 
speed. The experimental group listened 
to the practice sentences at the same 
one-half rate which was employed in 
the test tapes. It was hypothesized that 
such training would result in relatively 
greater improvement for the experi- 
mental group. The control group 


Table 3. Intelligibility (per cent correct) scores for four successive presentations of half-speed
speech to experimental (learning) and control (no learning) groups.

Presentation                   Day of Testing                  Mean
                        1            2       3       4

Experimental Group
  Sequence 1           17           32      39      41          32
  Sequence 2           27           30      37      43          34
  Sequence 3           21           30      37      41          32
  Sequence 4           19           32      35      47          34
  Mean                 21           31      37      43

Control Group
  Sequence 1           14           19      17      25          19
  Sequence 2           29           20      29      27          26
  Sequence 3           [illegible]  26      25      21          23
  Sequence 4           20           21      20      25          21
  Mean                 21           21      23      24













would, nevertheless, receive the same
treatment for all but the critical-fre- 
quency-shift aspect of the learning 
situation. Some learning was, of course, 
expected of those in the control group. 
They did listen, and respond, as did 
those in the experimental group, to 
the same test tape on four occasions 
(successive meetings of a Tuesday- 
Thursday class). They were never able, 
however, to check on the results of 
their identifications. Those in the ex- 
perimental group, on the other hand, 
received a feedback of the results of 
their identifications in the practice lis- 
tening situation where the reading 
scripts were employed with the slow 
speech. 


Results. The results of the listening 
tests described above are presented in 
Table 3 and appear to show beyond 
much doubt the value of the experi- 
mental training. This table gives the 
basic recognition data of Experiment II 
in terms of per cent correct articulation. 
The data are presented separately for 
each of the four days, four sequences 
and two conditions. 
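The day means of Table 3 can be checked with a short calculation. The sketch below uses the experimental-group cells exactly as printed; because one control-group cell is not legible, the control group is represented only by its printed day means. The variable and function names are illustrative.

```python
# Per cent correct for the experimental group, Table 3 (rows = sequences,
# columns = days 1 to 4).
experimental = {
    1: [17, 32, 39, 41],
    2: [27, 30, 37, 43],
    3: [21, 30, 37, 41],
    4: [19, 32, 35, 47],
}
# Printed day means for the control group (individual cells not all legible).
control_day_means = [21, 21, 23, 24]


def column_means(rows):
    """Average each day (column) over the four sequences (rows)."""
    return [sum(col) / len(col) for col in zip(*rows)]


exp_day_means = column_means(experimental.values())
exp_gain = exp_day_means[-1] - exp_day_means[0]
control_gain = control_day_means[-1] - control_day_means[0]
print(exp_day_means, exp_gain, control_gain)
```

The experimental group gains 22 percentage points between the first and fourth days, against 3 for the control group.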


Several conclusions may be drawn 
from the matrix of scores of Table 3. 
For the experimental group the day-to- 
day increase in number of correct re- 
sponses is marked and, with minor 
reversals, similar for all sequences of 
presentation. The average improvement 
over the first day’s score shown by the 
experimental group was in each case 
statistically significant (t = 6.08, 5.76, 
and 6.45; df = 9). Day-to-day improvement
was significant from day one to day
two (t = 6.08), and from
day three to day four (t = 3.48). On 
the other hand there is little tendency 
toward an increase in correct responses 


for the control group. What slight im- 
provement is shown is confounded by 
considerable variation among days and 
sequences. 
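The reported df of 9 for 10 listeners is consistent with a t test on correlated (paired) scores, although the text does not name the test. A minimal sketch of such a statistic follows. The individual scores are hypothetical, since the listeners' own scores are not given, so the resulting t is not expected to reproduce the published values.

```python
import math
from statistics import mean, stdev


def paired_t(before, after):
    """Paired-samples t statistic and degrees of freedom."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length samples of at least 2 scores")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # Mean difference divided by its standard error (sample SD / sqrt(n)).
    # Identical differences give zero SD; that case is not handled here.
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1


# Hypothetical per cent correct scores for 10 listeners on two days.
day_one = [17, 20, 23, 19, 22, 21, 18, 24, 20, 26]
day_two = [30, 29, 35, 27, 36, 30, 25, 38, 28, 32]
t, df = paired_t(day_one, day_two)
print(round(t, 2), df)
```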


No strong tendencies were observed 
from the data to indicate that there was 
better recognition for the ‘adjacent’ 
than for the ‘contrasting’ sequences of 
presentation. For the experimental 
group, for example, the adjacent se- 
quences gave 397 correct identifications, 
the contrasting sequences 395. This dif- 
ference is clearly not of practical im- 
portance. The low score obtained for 
Sequence One on the first day was not 
repeated on the second day and hence 
the difference is presumed to result 
from a practice effect. Sequence and 
practice effects are confounded in this 


design. 


It is interesting to compare the per 
cent recognition scores for the two 
experiments. The no-practice intelli- 
gibility score for speaker M-1 for this 
experiment averaged about 21%, compared
to 32% for Experiment I, for
the same amount of frequency shift.


This difference may result from the 
greater number of vowel choices pos- 
sible in this experiment, from the ab- 
sence of a carrier phrase, or from the 
somewhat different listening conditions. 


A preliminary analysis of the types 
of errors made both before and after 
training shows no great differences 
from those found in Experiment I. Be- 
fore training, the seven most difficult 
sounds in Experiment II were appar- 
ently the [z, 3, 1, ©, a, au, ar]. After 
training (on the fourth day of listening 
by the experimental group) errors were 
spread somewhat more evenly. The 
eight most difficult sounds were [e, 4, 
3, aU, &, 0, a, e]. 













The averages (for 10 listeners in each 
cell) given in Table 3 do not fully 
indicate the possible potential for im- 
provement. In comparing the first with 
the fourth trials for the experimental 
group, responses of individual listeners 
were found to range from a 6% in- 
crease for the poorest learner (17% to 
23%) to a 43% increase for the best 
learner (17% to 60%). 


Discussion 


These experiments have strengthened 
previous observations that a downward 
frequency shift of speech can be ex- 
pected to decrease markedly the intel- 
ligibility of that speech, at least insofar 
as vowels are concerned, in ways which 
seem to depend upon the particular 
energy-frequency distribution of the 
vowels as this varies among speakers 
and among vowels. At the same time 
the experiments demonstrate the ability 
of normal-hearing listeners to adapt to 
slow-speed distortions relatively quickly 
and easily, at least within the limits 
investigated. Thus, further investiga- 
tion seems warranted to discover what 
limits may be placed upon such adapta- 
tion and to discover what practical 
effect such distortions have upon the 
intelligibility of speech for the hard of 
hearing. Experiments are under way or
are being planned to explore the prob- 
lems to be encountered in the use of 
slowed speech as a ‘corrective distor- 
tion’ for the hard of hearing, as well 
as those to be encountered in the use of 
other methods of frequency shift. 

It is well known that General Ameri- 
can speech is highly redundant; that the 
vocal apparatus is capable of transmit- 
ting far more information than it is 
typically called upon to produce. It is 
also well known that, at least from the 


standpoint of speech, the ear is equally 
‘redundant,’ if not more so. It is capable 
of receiving, and usually does receive, 
far more information than needed. But 
for the greatest efficiency, transmitter 
should be matched to the receiver. It 
would be possible to cut the band 
width of both transmitter and receiver 
in half without unduly damaging 
speech communication; but speech 
communication would be damaged in- 
deed if an attempt were made to re- 
ceive the ‘top half’ of the speech with 
the ‘bottom half’ of the ear. This is 
the reasoning which might well sug- 
gest further serious attempts to shape 
the existing spectrum of normal speech 
to the narrowed abilities of the abnor- 
mal ear. There may well be traps in this 
reasoning, and certainly it is well 
known that a damaged ear is more than 
a static system with a limited band 
pass, but the obvious ability of the 
hard-of-hearing person to adapt to and 
to profit from exceedingly small cues 
to recognition would seem to indicate 
that the attempts should be made. 

One final comment: the results of 
this study appear to support the con- 
tention that recognition of syllabics is 
based upon frequency patterns rather 
than upon frequencies as such, that 
such patterns may be shifted rather 
markedly without interfering with rec- 
ognition, but that a shift beyond a cer- 
tain amount, probably governed by the 
listener’s previous language experience, 
requires a period of learning or adapta- 
tion. How far may such a pattern be 
shifted within the frequency domain? 
One might hypothesize that this would 
be limited only by the band width and 
resolving power of the ear. Further 
testing to explore the limits appears
highly desirable. 










Summary 


Two experiments were performed in 
the recognition of General American 
vowels which had been distorted 
through a slow playback of previously 
recorded CVC syllables with the con- 
sonant context held constant. The first 
experiment explored the influence upon 
intelligibility of varying amounts of 
speed change in tape recordings by 
both male and female speakers. The 
second experiment explored the influ- 
ence of a carefully specified amount of 
training upon the intelligibility of half- 
speed speech. The results of these ex- 
periments are in accord with previous 
evidence that intelligibility is markedly 
influenced by rate and speaker varia- 
tions and also demonstrated that marked 
improvement can result from short 
learning experiences. It was suggested 
that further exploration may result in 
providing corrective distortions for the 
hard of hearing. 













Children’s Articulation and Sound Learning Ability 


HARRIS WINITZ 


MARTHA LAWRENCE 


Although the research devoted to the 
etiology of functional articulation prob- 
lems has been extensive it has in general 
not indicated specific factors that may 
cause an articulation problem. In a re- 
view of the major studies in this area 
Powers (7) concludes that present re- 
search knowledge of functional artic- 
ulation cases has failed to demonstrate 
systematic deficiencies for any of the
factors studied. 

Milisen (4, p. 6) has suggested that 
‘defective articulation, a substitute re- 
sponse for normal articulation, results 
from the disruption of the normal 
learning process.’ He has speculated that 
this disruption may be the result of cer- 
tain reinforcement contingencies which 
may be operating in the environment. 
It is within this framework that the 
present investigation was undertaken. 
Specifically it was directed toward a 
comparison of sound learning of chil- 
dren with good and poor articulation. 
It was assumed that if at the present 
time the underlying learning mecha- 
nisms are alike for these two groups of 





Harris Winitz (Ph.D., University of Iowa, 
1959) is Assistant Professor of Speech Path- 
ology and Research Associate, Bureau of Child 
Research, University of Kansas. Martha Law- 
rence (B. A., University of Kansas, 1959) is a 
Graduate Assistant, Department of Speech 
Pathology, University of Kansas. This re- 
search was supported in part by U. S. Public 
Health Grant M-3987. 




children then both rate and level of 
learning would be similar. It was further 
hypothesized that if the learning curves 
were dissimilar the differences could 
be accounted for by a physical factor 
or a psychological factor or a com- 
bination of these two factors. Such dif- 
ferences could reflect physical factors 
such as neuromuscular facility and per- 
ceptual skills, and psychological factors 
such as personality profiles, emotional 
problems, and poor transfer of learn- 
ing. However, if it can be demonstrated 
that children with articulation prob- 
lems can learn a newly taught sound 
task as well as children considered to 
have normal articulation then it would 
appear justifiable to assume that pres- 
ent differences in articulation are not 
a result of the present operation of 
certain physical or psychological fac- 
tors. It should be noted, however, that 
it is entirely possible that certain re- 
inforcement contingencies that tend to 
perpetuate articulation errors may still 
be operating in the environment. In 
line with this hypothesis certain psy- 
chological factors could still be present. 


Procedure 


In an attempt to study the sound 
learning of these two groups of children 
it was decided that sounds not present 
in spoken English would provide an 




effective testing instrument. The study 
was thus directed toward the following 
two questions: (a) In learning of non- 
English sounds is there a difference be- 
tween children with good and children 
with poor articulation? (b) Is there a 
difference between sexes on this learn- 
ing task? 


Selection of Subjects. Subjects were 
96 kindergarten children, 48 boys and 
48 girls, serially selected from the en- 
rollment lists of five classes made availa- 
ble by the principals of two Lawrence, 
Kansas, elementary schools. The schools 
were considered by the principals to be 
similar in their socioeconomic class 
structure. These children were given 
the Templin (9) 50-item articulation 
screening test. On the basis of their 
scores on the Templin Test, the upper 
12.5% and the lower 12.5% of the 
children in each sex group were de- 
termined. In order to select two extreme 
groups in articulation ability, those sub- 
jects in the upper 12.5% were identi- 
fied as having good articulation (high 
group), and those in the lower 12.5% 
as having poor articulation (low group). 
The sample included six boys and six 
girls at each of the levels making a 
total of 24 children. In administering 
the Templin Test, the examiner at- 
tempted to obtain sounds from the 
children in response to picture cards. 
When this was not successful, the 
children were asked to repeat the words 
after the examiner. The observer agree- 
ment of the examiner had been previ- 
ously established (10). 
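The selection of the extreme groups can be sketched as follows. With 48 children in each sex group, 12.5% corresponds to six children at each extreme. The screening scores below are randomly generated placeholders, and the handling of tied scores (here, by sort order) is an assumption, since the report does not say how ties were resolved.

```python
import random


def extreme_groups(scores, fraction=0.125):
    """Return (high, low) lists of names: the top and bottom `fraction`
    of children ranked by screening score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = round(len(ranked) * fraction)
    return ranked[:k], ranked[-k:]


# Hypothetical Templin screening scores (0 to 50) for 48 boys.
rng = random.Random(0)
boys = {f"boy{i:02d}": rng.randint(18, 50) for i in range(48)}

high_boys, low_boys = extreme_groups(boys)
print(len(high_boys), len(low_boys))
```

Applying the same function to the 48 girls and pooling the results would give the 12 high and 12 low children described above.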


Articulation screening was restricted 
to children who were white, were from 
monolingual homes, had normal hear- 
ing (no loss greater than 15 db in each 
ear at frequencies 500, 1000, 2000, and
4000), as determined by the Lawrence
public school audiologist from one to 
seven months prior to the study, and 
were considered to be of normal in- 
telligence. Children suspected of mental 
retardation by their teacher in the be- 
ginning of the school year were re- 
ferred to the school psychologist for 
individual evaluation on the Stanford- 
Binet intelligence test and were not 
included in the study if they received 
scores below 70. Any child who had 
been diagnosed as a stutterer by the 
public school speech correctionist or 
classroom teacher was not included. 
Although no specific examination was 
made of dental and facial structure, 
none of the children appeared to have 
noticeable anomalies. 

Children excluded in the original 
sampling were 10 Negro children, eight 
children who had hearing losses, five 
mentally retarded children, one child 
diagnosed as a stutterer, and one child 
of bilingual background. 

The mean age of the entire group of 
96 children was five years and eight 
months, and the mean score on the 
Templin Test was 44.60. Templin (9) 
has reported a mean score of 44.80 for 
five year olds. The 12 children in the 
high articulation group in the present 
study had a mean age of five years and 
nine months and a mean screening test 
score of 49.83 on the Templin Test. 
The 12 children with poor articulation 
had a mean age of five years and nine 
months and a mean screening test score 
of 30.58. The range of screening test 
scores was 49 to 50 for the high group 
and 18 to 39 for the low group. Two 
of the children in the low group (a boy 
and a girl) had been receiving speech 
instruction continuously for seven 
months prior to this study. 













Learning Experiment. The learning 
task was administered within three 
weeks of the original screening. The 
procedure followed the instrumental 
reward conditioning paradigm. Here 
reinforcement was contingent upon the 
vocal matching of a discriminative 
stimulus. 

Three non-English sounds, [x], [œ],
and [ç], were selected because the experimenters
wanted to use both consonant
and vowel sounds. Preliminary
experimentation with several children
who were not included in the main
study demonstrated that kindergarten
children could produce these sounds.
Two of the sounds, [x] and [œ], were
included in articulation diagnostic tests
by Pettit (6).

Kantner and West (2) describe [ç]
as the voiceless analogue of [j], the
nearest approximation to it in the English
language being the glottal fricative
approach to [j]. This sound is called
the German ich laut. The sound [x] is
described as a voiceless, linguavelar
fricative, and is called the German ach
laut. The third sound, [œ], is a German
vowel which is described as a rounded
[e].

A tape recording of a native German 
linguist’s utterance of several produc- 
tions of the three non-English sounds, 
[œ], [xa], and [ça], was made. One utterance
of each sound was agreed upon by three 
judges as a correctly made production. 
These original speech sound recordings 
were made in conditions of quiet on 
a Magnecord tape recorder, Model 
PT6-65, with an Altec, Model 633A, 
condenser microphone. Tape speed was 
7.5 in. per sec. Five independent blocks 
of sounds were prepared with each 
block containing two instances of each 
sound randomly placed. With the same 


recorder the above mentioned speech 
sounds were dubbed at 5 sec intervals 
according to the randomly established 
order. For the final training tape the five 
blocks were repeated making a total 
of 10 blocks or 60 sounds. In present- 
ing the sounds to the children a Wol- 
lensak magnetic tape recorder, Model 
T 1500, was used with the volume level 
set at three. 
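The construction of the training tape can be sketched as follows. The block structure (five blocks, two instances of each of three sounds per block, the five blocks repeated, sounds at 5 sec intervals) follows the description above. The random seed, the function name, and the reading that the second set of five blocks repeats the first in the same order are assumptions made for illustration.

```python
import random

SOUNDS = ["œ", "xa", "ça"]


def make_blocks(n_blocks=5, per_sound=2, seed=0):
    """Build independent blocks, each holding every sound `per_sound`
    times in a random order."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    blocks = []
    for _ in range(n_blocks):
        block = SOUNDS * per_sound
        rng.shuffle(block)
        blocks.append(block)
    return blocks


blocks = make_blocks()
# The five blocks are repeated to give 10 blocks, or 60 sounds.
tape = [sound for block in blocks + blocks for sound in block]
# Onset time in seconds of each sound, at 5 sec intervals.
onsets = [5 * i for i in range(len(tape))]
print(len(tape), onsets[-1])
```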

Each of the 24 children was brought 
into a quiet room not far from his 
classroom. The order in which the chil- 
dren were selected was randomized with 
respect to both the child’s classroom 
and the child’s presence in the high 
or low group. The examiner who did 
the original screening brought each 
child into the testing room. Another 
examiner, who was unfamiliar with the 
group (good or poor) in which the 
child had been placed, scored the child’s 
responses. 

When the child entered the room, he 
was led to a table upon which a selec- 
tion of miniature toys had been dis- 
played and was told: ‘You will get a 
chance to win one of these toys in a 
game that we are going to play.’ 

Then the child was seated beside the 
second examiner at a second table, out 
of view of the toys. The examiner told 
the child: 


We are going to play a game in which 
you will have a chance to win some 
play money, and when we finish, you 
can take the money and buy one of the 
toys. I am going to play some sounds on 
this machine, and I want you to say each 
sound right after you hear it. If you say 
the sound just like the lady on the machine
says it, you will win play money.
I will take the money out of this yellow 
bowl and put it in the blue bowl. When 
you get enough play money in the blue 
bowl to buy a toy, we will stop the
game. Remember, you will get a toy only
if you say the sound right. If you don’t 










get a piece of money for saying a sound, 
you can try to say it a different way the 
next time. 

First, we will try some practice sounds 
so that you will see how to play the 
game. Say the sounds right after you 
hear them... [pa], [ba], [a], [ta] and 
[i]. Now we will play the practice sounds 
on the machine, and then we will be ready 
to play the game. 


The recorder was turned on, and the 
child said the practice sounds, [pa], 
[ba], [a], [ta], and [i]. Each child was 
instructed to watch the bowl so that he
would know when he said a sound cor- 
rectly. After the practice sounds were 
played, the machine was stopped, and 
the examiner said: 


Now we are ready to play the real 
game. Remember, when you say a sound 
right, you will get a piece of play money, 
but when you don’t say it right, you 
won't get any money. Let’s see whether 
you can win enough money to buy a 


toy. 

The child usually nodded when he 
understood the instructions. Only two 
or three children began to talk spon- 
taneously, and these children were 
quickly told merely to nod if they un- 
derstood the instructions. 





Figure 1. Mean acquisition curves on the non-English sound learning task (mean number of correct sounds plotted against blocks of six trials). High group, good articulation; low group, poor articulation.


The machine was turned on and the 
child attempted to imitate the first 30 
sounds. Occasionally, when a child did 
not respond to the first or second stimu- 
lus within the 5 sec span between 
sounds, the response was counted as 
incorrect and the recorder was stopped. 
The child was then cautioned to try to 
imitate the sound immediately after he 
heard it and the next sound was pre- 
sented. After the first 30 trials, the re- 
corder was stopped, and the examiner 
said: 

Now, you haven’t won enough money 
to buy a toy, so we are going to play 


Table 1. Means, standard deviations, and ranges for number of correct responses for each of
10 six-sound blocks in the learning task by children with good (high group; N = 12) and
poor (low group; N = 12) articulation.

Blocks        Mean               SD                   Range
           High    Low      High    Low            High      Low

   1        .83    .42      1.21    [illegible]    0 to 4    0 to 2
   2        .92    .67       .95    1.03           0 to 4    0 to 3
   3       1.33    .58      1.65    1.04           0 to 5    0 to 3
   4       1.33    .50      1.49     .87           0 to 4    0 to 2
   5       1.33    .50      1.25     .87           0 to 4    0 to 2
   6       1.50    .83      1.26    1.34           0 to 4    0 to 4
   7       1.42    .67      1.45    1.24           0 to 4    0 to 4
   8       1.58    .67      1.71    1.24           0 to 5    0 to 4
   9       1.75   1.00      1.48    1.53           0 to 4    0 to 4
  10       1.75   1.00      1.53    1.29           0 to 5    0 to 4



















Table 2. Summary of analysis of variance evaluating differences between groups (high and
low) and sexes, and assessing the effect of trials.

Source                  df        ms        F*

Between Subjects        23
  B (Groups)             1      28.70      2.31
  C (Sex)                1      59.00      4.74†
  BC                     1       1.21      <1
  Error (b)             20      12.45

Within Subjects        216
  A (Trials)             9       1.38      2.95**
  AB                     9        .25      <1
  AC                     9        .12      <1
  ABC                    9        .32      <1
  Error (w)            180        .47

Total                  239

*MS_B/MS_error(b); MS_C/MS_error(b); MS_BC/MS_error(b); MS_A/MS_error(w);
MS_AB/MS_error(w); MS_AC/MS_error(w); MS_ABC/MS_error(w).

†Significant at the 5% level; F.05 = 4.35.
**Significant at the 1% level; F.01 = 2.66.


the game some more. You will have to 
try harder and win more money. When 
you don’t say a sound right and don’t 
get any money, you should try to say 
the sound a different way the next time. 
Let’s see whether you can win some 
more money. 

The recorder was again turned on 


and the remaining trials completed. 


Results 


Reliability. Ten children, who had 
not been included in the original screen- 
ing (mean age, five years, 10 months; 
age range, five years, one month to six 
years, nine months), were selected for 
purposes of determining scorer reliabili- 
ty. Each child’s responses for the first 
30 sounds were scored independently 
by two observers. One examiner rein- 
forced the child out of view of the 
second observer. The percentages of 
agreement for each child ranged from 
76.7% to 100% with an average ob- 
server agreement of 86.7%. 
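The percentage of agreement for one child can be computed as sketched below. The judgments shown are hypothetical. They are arranged so that the two observers agree on 23 of 30 responses, which corresponds to the lowest reported figure of 76.7%.

```python
def percent_agreement(scorer_a, scorer_b):
    """Percentage of trials on which two observers gave the same judgment."""
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("need two non-empty score lists of equal length")
    matches = sum(a == b for a, b in zip(scorer_a, scorer_b))
    return 100.0 * matches / len(scorer_a)


# Hypothetical correct/incorrect judgments for one child's first 30 sounds.
observer_one = [True] * 30
observer_two = [True] * 23 + [False] * 7
print(round(percent_agreement(observer_one, observer_two), 1))
```

On the same basis, the average of 86.7% corresponds to agreement on 26 of 30 responses.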


Learning Task. The means, standard 
deviations, and ranges for the number 
of correct responses for the high and 
the low group for each of the 10 
blocks of six sounds are presented in 
Table 1. The mean curves are plotted 
in Figure 1. The results of an analysis 
of variance,¹ designated by Lindquist
(3) as a Type III, are summarized in 
Table 2. The F ratios for this analysis 
indicate that the trial effect is signifi- 
cant but the group effect and the group- 
by-trial interaction are not significant. 
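The F ratios of Table 2 follow from the mean squares and the error terms named in the table's footnote: between-subjects effects are tested against Error (b) and within-subjects effects against Error (w). The sketch below uses the printed mean squares and the critical values quoted with the table; the variable names are illustrative. Because the printed mean squares are rounded, the recomputed ratio for trials is about 2.94 rather than the printed 2.95.

```python
# Mean squares as printed in Table 2.
mean_squares = {
    "B (Groups)": 28.70, "C (Sex)": 59.00, "BC": 1.21,
    "A (Trials)": 1.38, "AB": 0.25, "AC": 0.12, "ABC": 0.32,
}
error_b, error_w = 12.45, 0.47
between_subjects = {"B (Groups)", "C (Sex)", "BC"}

# Each effect is divided by the error term appropriate to its stratum.
f_ratios = {
    name: ms / (error_b if name in between_subjects else error_w)
    for name, ms in mean_squares.items()
}

# Critical values as quoted with the table.
F_05_BETWEEN, F_01_WITHIN = 4.35, 2.66
for name, f in f_ratios.items():
    print(f"{name:12s} F = {f:.2f}")
```

Only the sex effect exceeds 4.35 among the between-subjects ratios, and only the trials effect exceeds 2.66 among the within-subjects ratios, in agreement with the conclusions stated above.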

The means, standard deviations, and 
ranges for the high girls, high boys, 

¹The large number of zero entries no doubt
introduced some nonnormality into the distribution.
However, the findings of Norton
(5) and Boneau (1) would seem to justify
the use of an analysis of variance with these
data. Boneau states (1, p. 62), ‘. . . the use
of the ordinary t test and its associated table
will result in probability statements which
are accurate to a high degree, even though
the assumptions of homogeneity of variance
and normality of the underlying distributions
are untenable.’







Table 3. Means and standard deviations for number of correct responses for each of 10
six-sound blocks in the learning task by sex for children with good (high group) and poor
(low group) articulation, N = 24, with six in each subgroup.

Blocks    Sex          Mean               SD
                    High    Low       High    Low

   1      Girls     1.50    .83       1.38    .90
          Boys       .17    0          .37    0

   2      Girls     1.50   1.17        .96   1.21
          Boys       .33    .17        .47    .37

   3      Girls     1.83   1.17       1.21   1.21
          Boys       .83    0         1.86    0

   4      Girls     1.67   1.00       1.37   1.00
          Boys      1.00    0         1.53    0

   5      Girls     1.67   1.00        .94   1.00
          Boys      1.00    0         1.41    0

   6      Girls     1.83   1.67        .90   1.49
          Boys      1.17    0         1.46    0

   7      Girls     1.67   1.33       1.37   1.49
          Boys      1.17    0         1.46    0

   8      Girls     2.00   1.33       1.63   1.49
          Boys      1.17    0         1.68    0

   9      Girls     2.17   1.67       1.21   1.49
          Boys      1.33    .33       1.53    .74

  10      Girls     2.17   1.33       1.46   1.49
          Boys      1.33    .67       1.49    .94








low girls, and low boys are presented 
in Table 3; the means are graphed in 
Figure 2. The main effect for sex is 
significant while the first order sex 
interactions are not significant. It would 
seem reasonable to infer from these


Figure 2. Mean acquisition curves by sex on the non-English sound learning task (mean number of correct sounds plotted against blocks of six trials). High group, good articulation; low group, poor articulation.


results and from an examination of 
Table 3 that the performance of the 
girls is superior to that of the boys in 
each of the two articulation groups. 
Apparently the difference between sexes 
is greater and more consistent than 
the difference between articulation 
groups. 


As may be observed from Table 1 
the performance level of learning that 
was finally achieved was very low and 
does not appear to represent an asymp- 
totic learning level. For the 10 blocks 
the high group ranged from .83 of a
sound on the first block to 1.75 of a
sound on the tenth block, and the low
group ranged from .42 of a sound on
the first block to 1.00 sound on the
tenth block.








Figure 3. Individual acquisition curves for all subjects on the non-English sound learning task (number of correct sounds plotted against blocks of six trials, with separate panels for the boys and girls of the high and low groups).

As may be noted from the individual 
learning curves in Figure 3, five chil- 
dren in the low group and four chil- 
dren in the high group failed to produce 
any correct responses in 60 trials. Six 
of these nine children were boys, three 
of them in the high group and three 
in the low group. The range of total 
scores for all the children was zero to 
34, the highest score being achieved by 
a boy in the high group. 

Curves for the individual sounds are 
presented in Figure 4. From the curves 
in this figure it may be noted that the 
order of difficulty for the three sounds, 
from easiest to hardest, was [œ], [ç],
and [x]. Girls performed better than 




















Figure 4. Mean acquisition curves for boys and girls separately for each of the three non-English sounds [œ], [x], and [ç] (plotted against blocks of two trials).







boys on all three of the non-English 
sounds. The same order of difficulty 
for the three sounds, although not 
shown in the figure, was found for the 
high and low groups. 


Discussion 


The results of this study appear to 
indicate that kindergarten children with 
good and with poor articulation are 


equally facile in learning to perform a


sound task consisting of sounds not 
present in the English language. This 
finding would appear to contraindi- 
cate the presence of any factors either 
physical or psychological that may 
inhibit or decelerate such learning. 
Once again it should be mentioned that 
certain reinforcement contingencies may 
still be, and most likely are, operative. 

he paradigm employed in this study 
merely provides for the learning of 
new responses under a specific schedule 
of reinforcement; it does not provide 
or a testing of reinforcement con- 
tingencies that may still occur in the 
environment. 


It is tenable, however, that organic 
factors were the cause of a breakdown 
in speech learning some time prior to 
kindergarten age but are now no longer 
operative. Since organic factors have 
never been clearly shown to operate in 
the usual functional articulation case, 
and since the results of this study pro- 
vide no evidence that level of articula- 
tion functioning of kindergarten 
children is related to sound learning 
ability, it would appear to be reason- 
able to assume that certain, as yet un- 
identifiable, learning factors which have 
operated in the past may account for 
differences in their articulation func- 
tioning. 


The significant sex factor might pos- 
sibly indicate, if the idea of organic 
factors is accepted, that there are physi- 
cal differences between sexes in the 
speech mechanisms, while at the same 
time there is no such difference which 
influences present level of articulation 
functioning. It would seem far more 
reasonable, however, to conclude that 
the apparent differences between sexes 
are in some way related to cultural 
factors. The sex difference was a sur- 
prising result in view of recent find- 
ings by Winitz (10), who has reported 
extremely small, and in most instances 
nonsignificant, differences between sexes 
on a variety of language measures. In 
the present study the differences be- 
tween boys and girls again were not 
large but the number of repeated meas- 
urements made on each child probably 
made for a high degree of precision 
which allowed small differences to be 
apparent. 

In summary, it would seem that dif-
ferences in articulation ability may be
due to some rather complex reinforce-
ment contingencies that have operated
in the past or still operate, for when 
learning conditions are made similar, 
as in this study, differences between 
children with good articulation and 
those with poor articulation are not 
apparent in rate or level of learning. 
Since little is known about the factors 
that account for language learning (al- 
though much information is available 
on the stages of language growth), it 
is not as yet clear what the reinforce- 
ment contingencies might be. 

Until further studies within this 
framework are conducted the viewpoint 
of the authors should be considered as 
tentative. With regard to the procedure 
of the present study at least two criti-
cisms might be advanced. First, as
measured by the Templin Test, the 
difference in articulation performance 
between the two articulation groups 
was not as great as had been expected; 
and second, the number of trials used 
in this study might not have afforded an 
opportunity to examine completely the 
learning curves of the two groups. 

An interesting by-product of this 
study was the observation of sound 
errors made by subjects in their at- 
tempts to produce the test sound cor- 
rectly. Although no systematic analysis 
was made of these errors it appeared 
that the subjects were making closer 
and closer approximations to the test 
sound until finally the sound was said 
correctly. Skinner (8), in his research 
with animals, has referred to this learn- 
ing procedure as the successive approxi- 
mation method. He has conditioned 
animals to perform tasks that originally 
were not necessarily in their repertoire 
of responses by reinforcing closer and 
closer approximations to the desired 
task. For example, rats have been con- 
ditioned to increase continually the in- 
tensity of their bar press by reinforcing, 
in successive stages, bar presses of a 
certain magnitude of intensity and with- 
holding reinforcement from bar presses 
of less than a certain magnitude of in- 
tensity. Future studies might be directed 
toward an understanding of the con- 
tinuum along which articulation re- 
sponses should be reinforced to increase 
most effectively acquisition of the de- 
sired sound. 

In addition it would seem that the 
use of non-English sounds, as employed 
in this study, presents a useful learning 
paradigm for the testing of important 
variables in articulation learning. This 
paradigm may be used to investigate the
effect of such variables as position of
sound, preceding and succeeding vow- 
els, blends, auditory discrimination 
training, reinforcement schedules, gen- 
eralization training, delay of reward, 
and reinforcing stimuli. 


Summary 


Sound learning ability of 12 kinder- 
garten children (6 F and 6 M) having 
good articulation was compared with 
that of 12 kindergarten children (6 F 
and 6 M) having poor articulation. Sub- 
jects were selected from a group of 96 
kindergarten children on the basis of 
scores on the Templin 50-item screen- 
ing articulation test. The non-English 
sounds, [x], [œ], and [ç] were used,
each appearing twice in a block of six 
sounds with a total of 10 blocks. 

Each child was asked to imitate 
sounds presented by tape recorder. A 
correct response was reinforced with 
play money later exchanged for a toy. 
Results indicate that some kindergarten 
children can learn the non-English 
sounds used, and that girls’ performance 
is superior to boys’. No evidence was 
obtained to support the hypothesis of 
a difference in learning ability between 
groups differing in articulation ability 
as measured by the Templin Test. These 
findings were discussed in relation to 
a learning theory of articulation dis- 
orders. 


Acknowledgement 


The authors express their appreci- 
ation to the research committee of the 
Lawrence, Kansas, public schools for 
permitting this study to be undertaken. 
Gratitude is expressed also to Drs. 
Frances Horowitz, Joseph Spradlin, 
James Neelley, and Gerald Siegel for 
their suggestions and criticisms. 







References 


1. Boneau, C. A., The effects of violations
of assumptions underlying the t test.
Psychol. Bull., 57, 1960, 49-64.

2. Kantner, C. E., and West, R. W.,
Phonetics. New York, London: Harper,
1941.

3. Lindquist, E. F., Design and Analysis of
Experiments in Psychology and Educa-
tion. Boston: Houghton Mifflin, 1953.

4. Milisen, R., A rationale for articulation
disorders. J. Speech Hearing Dis.
Monogr. Suppl., 4, 1954, 5-17.

5. Norton, D. W., An empirical investiga-
tion of the effects of nonnormality and
heterogeneity upon the F-test of analysis
of variance. Ph.D. dissertation, Univ.
Iowa, 1952.

6. Pettit, C. W., The predictive efficiency
of a battery of articulatory diagnostic
tests. Speech Monogr., 24, 1957, 219-226.

7. Powers, Margaret Hall, Functional dis-
orders of articulation—symptomatology
and etiology. Chap. 23 in L. E. Travis
(Ed.), Handbook of Speech Pathology.
New York: Appleton-Century-Crofts,
1957.

8. Skinner, B. F., The Behavior of Organ-
isms; an Experimental Analysis. New
York, London: Appleton-Century, 1938.

9. Templin, Mildred C., Norms on a
screening test of articulation for ages
three through eight. J. Speech Hearing
Dis., 18, 1953, 323-331.

10. Winitz, H., Language skills of male and
female kindergarten children. J. Speech
Hearing Res., 2, 1959, 377-386.


RESEARCH NEWS NOTE


von Bekesy, Tato, van Dishoeck
Round-Table Chairmen for 1962
International Congress of Audiology


The sixth International Congress of Audi-
ology will be held in Leyden, The Nether-
lands, from September 5 to 8, 1962. The
president is Dr. H. A. E. van Dishoeck and
the secretary, Dr. A. Spoor.

In the program are planned three round-
table talks, with associated and free papers.
The subjects of the round tables and the
moderator of each are: ‘Frequency analysis of
the normal and pathological ear,’ Dr. G. von
Bekesy; ‘Central deafness in children,’ Dr.
J. M. Tato; ‘Psychogenic deafness and simu-
lation,’ Dr. van Dishoeck.

Official languages of the Congress are Eng-
lish, French, German, and Spanish. Working
languages will be English and French.

The address of the secretariat is:


Ear-Nose-Throat Department
Academisch Ziekenhuis
Leyden, The Netherlands

























Reliability of Conditioned GSR 
Pure-Tone Audiometry with Adult Males 


JOSEPH B. CHAIKLIN 


IRA M. VENTRY 


LYMAN S. BARRETT 


Since its introduction (5, 6) condi-
tioned galvanic skin response (GSR)
pure-tone audiometry has gained in-
creasing acceptance as a method for
measuring auditory thresholds (10, 25).
The validity and reliability character-
istics of a test in such wide use are
essential information for those who
must use it and assess its results. In
1954 Doerfler and McClure (12) pub-
lished data demonstrating the validity
of conditioned GSR pure-tone audiom-
etry with hard-of-hearing adults. They
found that pure-tone thresholds meas-
ured with conditioned GSR audiom-
etry were generally within ±5 db of
pure-tone thresholds measured with
voluntary (conventional) pure-tone
audiometry. Doerfler and McClure em-
ployed systematic conditioning and
sampling schedules, specific response
criteria (latency, magnitude, and slope),
and a conditioning criterion. Burk (7)
later reported GSR validity data for
normal-hearing subjects. His data were
similar to those of Doerfler and Mc-
Clure. Burk’s description of his meth-
odology indicates that response cri-
teria, a conditioning criterion, and
a planned sampling procedure were
used. While the validity of conditioned
GSR pure-tone audiometry with adults
has been studied, there do not appear
to be any published reports of research
concerned with its reliability.* The
purpose of the present study was to
assess the test-retest reliability of con-
ditioned GSR audiometry with adult
males.

One approach to evaluating the test-
retest reliability of GSR audiometry
is to assess the reliability of candidacy
for GSR audiometry; that is, to evalu-
ate whether subjects meet a GSR con-
ditioning criterion on repeated tests.
Another approach, the one that re-
ceived major emphasis in the present
study, is concerned with the test-retest
reliability of conditioned GSR pure-
tone thresholds for subjects who meet
a conditioning criterion on repeated
tests. In following the latter approach
data were used from subjects who met
a GSR conditioning criterion on two
tests and for whom conditioned GSR
thresholds at 1000 cps were obtained
both times. This plan, of course, pre-
cluded evaluation of the data of subjects
who failed to meet the conditioning
criterion on one of the two tests.


Joseph B. Chaiklin (Ph.D., Stanford Uni-
versity, 1958) and Ira M. Ventry (Ph.D.,
Stanford University, 1958) are Clinical Super-
visors, and Lyman S. Barrett (Ph.D., 1959,
Stanford University) is Chief, Audiology and
Speech Pathology Clinic, Veterans Admin-
istration Hospital, San Francisco. Drs. Chaik-
lin and Ventry are also Research Associates,
San Francisco Institute of Medical Sciences.
This article is based on a paper presented at
the 1960 Convention of the American Speech
and Hearing Association, Los Angeles.

*On logical grounds a test which has been
demonstrated to be valid may be presumed
to be reliable (21, pp. 94-95). The empirical
demonstration of reliability, however, has
value in defining a test’s actual degree of re-
liability. Test developers usually assess both
validity and reliability before offering a test
for general use.


Procedure 


Subjects. The subjects were 41 adult 
males, 18 with bilateral hearing loss 
and surgically demonstrated otoscle- 
rosis and 23 with normal hearing in at 
least one ear at 500 cps, 1 000 cps, and 
2 000 cps. Normal hearing was defined 
as a threshold no higher than the 10 
db hearing level re audiometric zero 
(1). The otosclerotic subjects ranged 
from 30 to 52 years of age with a mean 
age of 40.4 years. The normal-hearing 
subjects ranged from 23 to 53 years 
of age with a mean age of 33.6 years. 
All subjects combined ranged from 23 
to 53 years of age with a mean age of 
36.6 years. 
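
The combined mean age follows from the two group means weighted by group size. A minimal check, using only the figures reported above (the variable names are illustrative and not part of the study):

```python
# Group sizes and mean ages as reported in the Subjects paragraph.
groups = {
    "otosclerotic": {"n": 18, "mean_age": 40.4},
    "normal_hearing": {"n": 23, "mean_age": 33.6},
}

total_n = sum(g["n"] for g in groups.values())
# Weighted (pooled) mean: sum of n * mean over the groups, divided by total n.
pooled_mean = sum(g["n"] * g["mean_age"] for g in groups.values()) / total_n

print(total_n)                # 41
print(round(pooled_mean, 1))  # 36.6
```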


Subjects were selected by the follow- 
ing criteria: (a) negative neurological 
history, (b) report of stable hearing 
acuity in the ear under test, and (c) 
absence of ear, nose, or throat con-
ditions which might cause abnormal
threshold fluctuations. Subjects also had
to meet a GSR conditioning criterion
during each of two experimental ses-
sions spaced approximately one month
apart. All subjects participated on a
voluntary basis and none was informed
of the purpose of the experiment until 
after the second experimental session. 
The otosclerotic subjects were veterans 
who had had fenestration or stapes 
mobilization surgery performed on one 
or both ears. The normal-hearing sub- 
jects were friends and acquaintances of 
the experimenters and their colleagues. 
Some of the normal-hearing subjects 
were hospital employees but none was a 
hospital patient. 


At the start of each subject’s first 
experimental session one ear was se- 
lected for testing. The same ear was 
retested during the second experimental 
session. If only one ear of an otoscle- 
rotic subject had undergone surgery, 
then his unoperated ear was selected for 
testing. If both ears had been operated 
on, the ear selected was the one having 
had the earliest surgery. The mean 
lapsed postsurgical time was 58 months, 
and ranged from 17 to 126 months. The 
right ear of a normal subject was 
selected for testing if it met the audio-
metric and medical selection criteria.
The left ear of a normal subject was
selected only if the right ear failed to
meet the selection criteria.

Equipment. Testing was performed
with a Grason-Stadler psychogalvanom- 
eter (Model E-664) and an Allison 
Laboratories audiometer (Model 21C) 
equipped with Telephonics Corporation 
10-ohm dynamic earphones (Model 
TDH 39). The psychogalvanometer 
contains a Sanborn graphic recorder 
(Model 127) and an automatic stimulus 
timing switch including provision for 
automatically marking the graphic rec- 
ord to indicate onset of stimuli. A 20 db 
pad was inserted into the earphone line 
during tests of subjects with normal
hearing. During the course of the study
the stability of the audiometer’s calibra- 
tion was checked periodically with an 
Allison Laboratories Calibration Unit 
(Model 300). Tests were conducted 
with subjects seated in an Industrial 
Acoustics Company Audiometric Test- 
ing Room (Model 1203). Under test 
conditions the ambient noise level in 
the room was 38 to 42 db re 0.0002
dyne/cm² as measured on the C-scale
of a General Radio Company Sound 
Level Meter (Type 1551-A). The am- 
bient noise level was so low that no 
reading could be obtained on the A or 
B scales. The chair in which subjects 
were seated had wide, padded arm rests 
designed to facilitate arm stability dur- 
ing GSR audiometry. The test equip- 
ment and the experimenter were located 
outside the test room. The experimenter 
could see the subject through an ob- 
servation window, but the subject 
could not see the experimenter without 
turning toward the observation win- 
dow. 


Experimental Sessions. Each subject 
was tested twice; the interval between 
tests ranged from 27 to 43 days with a 
mean interval of 31.9 days. 


All experimental sessions consisted of 
an instruction period, a pre-GSR volun- 
tary threshold measurement at 1 000 
cps, a GSR conditioning period, condi- 
tioned GSR threshold sampling at 1 000 
cps, and a post-GSR voluntary thresh- 
old measurement at the same frequency. 
At the start of the first experimental 
session normal-hearing subjects received 
threshold measurements at 500 cps, 
1 000 cps, and 2 000 cps. Except for this 
preliminary screening procedure, the 
experimental sessions for both groups 
were the same. Most of the sessions
lasted approximately 45 minutes, al-
though a few lasted more than an hour.
A different experimenter conducted the 
second experimental session for each 
subject. The experimenter had no 
knowledge of the subject’s previous 
thresholds. Testing was performed by 
two experimenters (JBC and IMV), 
with each conducting about an equal 
number of first and second tests. 


Instructions. Before each of the ex- 
perimental sessions the following in- 
structions were read to the subject: 


You are going to have another test of 
your hearing. Before the test starts I will 
fasten two discs to the fingers of each of 
your hands. During the test you will hear 
tones in your (right, left) ear, and with 
some of the tones you will feel an electric 
shock on the fingers of your right hand. 
I will adjust the shock to a point which 
you report as distinctly unpleasant but not 
painful. The test will require you to re- 
main as still as possible. If it is absolutely 
necessary for you to move, try to limit 
such moves to the periods immediately 
following the shocks. Be certain to settle 
down quickly after each movement. It is 
important that you stay alert during the 
test. Keep your eyes open and do not 
let your gaze become fixed too long on 
one spot. On the other hand, you should 
not move your head around to change 
your view or look at me. There will be 
long periods during which you will hear 
nothing. You should stay alert during 
these periods. When I say to you ‘The 
test will begin now’ you will not have to 
give me any further signals when you 
hear something in the earphones or when 
you feel a shock. The entire test will last 
from 35 to 55 minutes, and it will be 
divided into 2 parts. The first part will be 
from 10 to 30 minutes long. The second 
part will be about 20 minutes long. There 
will be a 5 minute rest period between the 
first and second parts of the test. 


Subject Preparation. After the in- 
structions were read, flat silver-alloy 
shock electrodes 0.25 in. in diameter 
were attached to the pads of the index 
and ring fingers of the right hand;
silver-alloy pick-up (resistance) elec-
trodes 0.5 in. in diameter were attached
to the pads of the index and ring fingers 
of the left hand, and a steel EKG 
ground electrode was attached to the 
ventral surface of the right wrist. Be- 
fore the electrodes were attached the 
appropriate skin surfaces were cleaned 
with rubbing alcohol. Cambridge di- 
rect-contact electrode jelly was used 
to improve electrical contact between 
the electrodes and skin surfaces. After 
the electrodes were attached, the sub- 
ject was told that further instructions 
would be administered through the ear- 
phones, the earphones were positioned, 
and the test room doors were closed. 


Shock Adjustment. The next step 
was adjustment of the pulsatile shock 
that served as the unconditioned stimu- 
lus (UCS) during the GSR condition- 
ing and threshold sampling. The subject 
was told that the shock would be in- 
creased in small steps until he reported 
that it was ‘distinctly unpleasant, but
not painful.’ The experimenter pre-
sented an ascending series of 0.5-sec
shocks (4) until the subject reported 
that the shock was unpleasant. The ex- 
perimenter then asked the subject if he 
could tolerate a stronger shock. If the 
subject indicated that he could tolerate 
it stronger, the UCS strength was in- 
creased until he reported he could 
tolerate it no higher (11). The UCS 
strength on the first test ranged from 
.58 ma to 4.10 ma with a mean of 1.78 
ma. The UCS strength for the second
test ranged from .58 ma to 4.30 ma with
a mean of 1.94 ma. The maximum UCS
available was 3.6 or 4.3 ma depending
on which of three psychogalvanometers
was used. After the maximum tolerable
UCS was determined the subject was
told that it would remain at that
strength for the remainder of the test.

*It has been the experience of the authors
that many subjects can tolerate a UCS
stronger than the UCS they initially report as
distinctly unpleasant. It appears that request-
ing an emphatic report of unpleasantness
serves to attain the strongest tolerable UCS.


Voluntary Pure-Tone Threshold 
Measurement. The final procedure be- 
fore conditioning was a threshold meas- 
urement at 1000 cps. The stimulus 
timing switch of the Grason-Stadler 
psychogalvanometer was used to de- 
liver a 1000 cps stimulus produced by 
the Allison Audiometer. The stimulus 
time was one second, the same duration 
used later during conditioning and 
sampling. Voluntary threshold meas- 
urements employed an ascending 
‘on-effect’ procedure similar to the 
procedure recommended by Carhart 
and Jerger (8). They recommend 
ascending only to the point at which 
a response is elicited. The primary de- 
viation from their methodology was to 
ascend 5 db above the point at which 
the subject first responded in each 
ascending series. 

The word ‘yes’ was used as the re- 
sponse signal during the pre-GSR pure- 
tone threshold measurement. After the 
pre-GSR threshold measurement the 
subject was told that the test would 
begin and that he would not have to 
give any further signals when he heard 
something in the earphones or felt a 
shock. The procedure described above 
was used to measure the 1000 cps 
threshold at the end of the test session. 


Conditioning Process. A randomized 
schedule of 40 stimulus events and 39 
interevent intervals was followed dur- 
ing the conditioning process. If the 
schedule called for a conditioned stimu-
lus (CS) the experimenter presented
a 1-sec 1000 cps tone 10 db above the 
pre-GSR voluntary threshold of the ear 
under test (13, 16). If a reinforced 
event (CS-UCS) was prescribed, he 
presented a 1-sec 1000 cps tone at the 
same level, followed 0.5 sec after its 
onset by a 0.5-sec shock which con- 
tinued to the end of the tone. A 0.5-sec 
CS-UCS interval (2, 22, 24, 31) was 
used for all reinforcement events in the 
conditioning and sampling processes. 
Approximately 40% partial random 
reinforcement 10 db above voluntary 
threshold was used during conditioning 
(15, 20, 23). Intervals of 30, 45, or 60 
sec were used between events (27). An 
equal number of each of these three 
intervals was randomly dispersed in the 
schedule. If the entire schedule was 
used it required 30 minutes. 
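
The arithmetic of the schedule can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' schedule: the text says reinforcement was "approximately 40%", so exactly 16 reinforced events is assumed; the shuffling, the function name, and the treatment of each interval as a gap between 1-sec events are also assumptions.

```python
import random

def build_conditioning_schedule(seed=0):
    """Return (events, intervals) for one hypothetical 40-event schedule."""
    rng = random.Random(seed)
    # 40 stimulus events, about 40% of them reinforced (CS followed by UCS).
    n_events = 40
    n_reinforced = round(0.40 * n_events)  # assumed: exactly 16 events
    events = ["CS-UCS"] * n_reinforced + ["CS"] * (n_events - n_reinforced)
    rng.shuffle(events)
    # 39 interevent intervals: equal numbers of 30-, 45-, and 60-sec gaps.
    intervals = [30, 45, 60] * 13
    rng.shuffle(intervals)
    return events, intervals

events, intervals = build_conditioning_schedule()
tone_sec = 1  # each event is a 1-sec tone
total_sec = sum(intervals) + tone_sec * len(events)

print(len(events), len(intervals))  # 40 39
print(events.count("CS-UCS"))       # 16
print(round(total_sec / 60, 1))     # 29.9
```

Thirteen intervals of each length sum to 1755 sec, which together with the 40 one-second tones comes to just under the 30 minutes stated in the text.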


The purpose of the conditioning 
process was to bring the subject to a 
level of responsiveness at which he pro- 
duced acceptable GSRs consistently 
when stimulated by unreinforced stim- 
uli (CS without UCS).* Requirements 
for an acceptable response were a la- 
tency of 1.5 to 3.5 sec, a minimum 
magnitude of 1 mm, and a minimum 
slope of 45°. Latency was judged from 
the onset of the CS to the onset of the 
GSR. The conditioning criterion had 
to provide reasonable assurance that in 
subsequent GSR threshold sampling an 
absence of a GSR to a CS, in most in- 
stances, would indicate that the CS was 
not audible, or that it was barely audi- 


*The work of Cook and Harris (9) sug- 
gests that some subjects may be verbally 
conditioned by instructions given before for- 
mal GSR conditioning procedures. Experi- 
ence of the authors suggests that there are 
also subjects who respond reliably only when 
subjected to a formal conditioning process. 


ble. The conditioning criterion required 
three successive responses to unrein- 
forced CSs after the fifth scheduled 
event. The evaluation of conditioning 
was delayed until after the fifth event 
as a precaution against premature 
judgment of conditioning. Scheduled 
reinforced events were allowed to inter- 
vene but were not judged in tallying 
responses for the conditioning criterion. 
Subjects met the conditioning criterion 
in an average of 8.5 minutes on the first 
test and 6.4 minutes on the second test. 
After the conditioning criterion was 
met the earphones were removed and 
the subject was given a five minute rest 
period before threshold sampling com- 
menced. If the subject did not meet the 
conditioning criterion during the first 
test session he was still rescheduled for 
his second experimental session. The 
data of 15 subjects (nine normals and 
six otosclerotics) who did not meet the 
conditioning criterion on one or both 
tests are not reported here. These 15 
subjects represent 25% of the 59 sub- 
jects who were seen for two experi- 
mental sessions. Of the 15 subjects, 
there were nine who failed to meet the 
conditioning criterion during both ex- 
perimental sessions. 
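
The response criteria and the conditioning criterion described above can be written as two small checks. This is a sketch under assumptions: the function names and the event encoding are hypothetical, and "after the fifth scheduled event" is read as counting from the sixth event onward.

```python
def is_acceptable_gsr(latency_sec, magnitude_mm, slope_deg):
    """Apply the three response criteria stated in the text."""
    return (1.5 <= latency_sec <= 3.5
            and magnitude_mm >= 1.0
            and slope_deg >= 45.0)

def met_conditioning_criterion(events):
    """events: list of (kind, responded) pairs in presentation order.

    kind is "CS" (unreinforced) or "CS-UCS" (reinforced).
    Returns the 1-based event number at which three successive
    unreinforced CSs have drawn responses, or None if never met.
    """
    streak = 0
    for number, (kind, responded) in enumerate(events, start=1):
        if number <= 5:       # evaluation is delayed until after event 5
            continue
        if kind == "CS-UCS":  # reinforced events intervene but are not tallied
            continue
        streak = streak + 1 if responded else 0
        if streak == 3:
            return number
    return None

# Hypothetical session: events 1-5 are ignored, event 7 is reinforced.
session = [
    ("CS-UCS", True), ("CS", False), ("CS-UCS", True), ("CS", True),
    ("CS-UCS", True),
    ("CS", True), ("CS-UCS", True), ("CS", True), ("CS", True),
]
print(met_conditioning_criterion(session))  # 9
print(is_acceptable_gsr(2.0, 1.5, 50))      # True
print(is_acceptable_gsr(4.0, 1.5, 50))      # False
```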


GSR Threshold Sampling. After the 
rest period the earphones were re- 
placed, the test room doors were closed 
again, and threshold sampling was 
started. The experimenters followed 
a 30-event sampling schedule which 
called for four ascending series of CS 
presentations with each series including 
CSs at —10, —5, 0, +5, and +10 db re 
pre-GSR voluntary threshold. The 
ascending GSR sampling procedure was 
patterned after the authors’ clinical ap-
plication of GSR audiometry and was
similar to the ascending technique used
for voluntary pure-tone threshold 
measures. 

Five control events were dispersed
in the schedule (one in a different posi- 
tion in each ascending series) as a check 
on the validity of responses to the sam- 
pling CSs. A control event consisted of 
a stimulus marker on the graphic rec- 
ord, simulating presentation of a CS 
but with no CS presented to the subject. 
If frequent enough, random responses 
during GSR audiometry may invalidate 
a test. A system of control presenta- 
tions, therefore, is essential to check on 
the frequency of random responses. 

The intervals between events on the 
sampling schedule were randomly ar-
ranged in a manner similar to that de-
scribed for the conditioning process.
Five CS-UCS events 10 db above volun- 
tary threshold were distributed ran- 
domly in the sampling schedule. The 
sampling process took approximately 
22 minutes. After threshold sampling 
was completed a post-GSR 1000 cps 
threshold was measured and the test 
was terminated. The subject’s GSR 
record was assigned a randomly selected 
code number and the sampling portion 
of the record was detached from the 
conditioning portion. The sampling 
portion of the graphic record contained 
no information identifying the subject 
or the levels at which stimuli were 
presented. 
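
The composition of the 30-event sampling schedule can be tallied directly from the description above. The timing estimate at the end is an assumption: the text says only that intervals were arranged as in the conditioning process, so a mean interval of 45 sec and 1-sec events are assumed here.

```python
LEVELS_DB = [-10, -5, 0, 5, 10]  # re pre-GSR voluntary threshold
N_SERIES = 4                     # ascending series

# Unreinforced sampling presentations: every level once per series.
sampling_cs = [(series, level)
               for series in range(1, N_SERIES + 1)
               for level in LEVELS_DB]
n_control = 5     # marker on the record, no tone presented
n_reinforced = 5  # CS-UCS events 10 db above voluntary threshold

n_events = len(sampling_cs) + n_control + n_reinforced
print(len(sampling_cs), n_events)  # 20 30

# Assumed timing: 1-sec events and a 45-sec mean interevent interval.
approx_minutes = ((n_events - 1) * 45 + n_events * 1) / 60
print(round(approx_minutes))       # 22
```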


Table 1. Test-retest conditioned GSR and pre-GSR voluntary pure-tone (VPT) thresholds
at 1000 cps for normal-hearing subjects (N = 23). All threshold values are in decibels re
audiometric zero.

Subject   First  Retest           First  Retest
    No.     GSR     GSR    Diff     VPT     VPT    Diff
      1       5       0      -5       0       0       0
      2      10       5      -5       0       0       0
      3       5      -5     -10      -5      -5       0
      4       0       5       5      -5       0       5
      5       5       5       0       5       0      -5
      6       0      -5      -5      -5      -5       0
      7       5       0      -5       0      -5      -5
      8       0      -5      -5      -5      -5       0
      9      -5      -5       0      -5      -5       0
     10       5       5       0       0       0       0
     11      10      10       0       5       5       0
     12      -5      -5       0      -5      -5       0
     13      -5     -10      -5     -10     -10       0
     14       0      -5      -5      -5      -5       0
     15      10      15       5       5      10       5
     16       5       0      -5       0       0       0
     17     -10      -5       5     -10      -5       5
     18       5       0      -5       0       0       0
     19       0       0       0      -5      -5       0
     20      -5       0       5     -10      -5       5
     21       0      -5      -5      -5      -5       0
     22     -10     -10       0     -10      -5       5
     23      -5      -5       0     -10     -10       0
   Mean    0.87   -0.87   -1.74   -3.48   -2.83    0.65
     SD    5.77    5.95    4.02    4.71    4.19    2.65


Analysis of the GSR Sampling Rec-
ords. At the conclusion of the study the
sampling portions of the graphic
records were analyzed by three experi- 
enced judges who applied the previ- 
ously described response criteria to each 
sampling event. Each event was judged 
as being either a ‘response’ or a ‘no 
response.’ Responses had to meet all 
three response criteria. The three 
judges’ analyses were identical on 
96.9% of the 2 049 events analyzed.

Threshold Definition. Three thresh-
old estimates (one for each judge) were
derived for each test. The threshold 
value derived from the analyses of at 
least two of the three judges was the 
final GSR threshold for each test. 
Thresholds derived from the separate 
sampling analyses were based on the 
relation between the number of accept- 
able responses to control and sampling 
events. If there were no responses to 
controls, threshold was set as the lowest 
hearing level at which there were at
least two responses to the four stimuli
presented at that level. If there was a 
response to only one of the control 
events threshold was the lowest hearing 
level at which there were at least three 
out of four responses; if there were two 
responses to controls then four re- 
sponses were required at one of the 
sampling levels. If there were three or 
more responses to the five control pres- 
entations, the record was considered 
invalid. The data of three otosclerotic 
subjects were discarded because of an 
unacceptable relationship between re- 
sponses to control and responses to CS 
presentations. 
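
The threshold rule above can be summarized in a short sketch. The function names and the example record are hypothetical. Two readings are assumptions: with two control responses the threshold is taken as the lowest level that drew four responses, and a record on which no two judges agree returns no value, a case the text does not cover.

```python
def required_responses(control_responses):
    """Responses (out of 4) needed at a level, given responses to the 5 controls."""
    if control_responses >= 3:
        return None  # record considered invalid
    return {0: 2, 1: 3, 2: 4}[control_responses]

def gsr_threshold(responses_by_level, control_responses):
    """responses_by_level maps hearing level (db) to responses out of 4.

    Returns the lowest level meeting the requirement, or None if the
    record is invalid or no level qualifies.
    """
    needed = required_responses(control_responses)
    if needed is None:
        return None
    for level in sorted(responses_by_level):
        if responses_by_level[level] >= needed:
            return level
    return None

def final_threshold(judge_thresholds):
    """Value on which at least two of the three judges agree, else None."""
    for value in set(judge_thresholds):
        if value is not None and judge_thresholds.count(value) >= 2:
            return value
    return None

record = {-5: 0, 0: 1, 5: 2, 10: 3, 15: 4}  # hypothetical response counts
print(gsr_threshold(record, 0))     # 5
print(gsr_threshold(record, 1))     # 10
print(gsr_threshold(record, 2))     # 15
print(gsr_threshold(record, 3))     # None
print(final_threshold([5, 5, 10]))  # 5
```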


Results 


Test-Retest Reliability Data. Table 1
shows the test-retest GSR and volun-
tary thresholds at 1000 cps for the 23
normal-hearing subjects. An inspection
of Table 1 indicates that 22 (95.7%)
normal-hearing subjects had retest GSR
thresholds within ±5 db of their first
GSR thresholds. One normal-hearing
subject had a retest GSR threshold 10
db lower than his initial GSR threshold.
Table 2 shows a similar finding for the
18 otosclerotic subjects. The retest GSR
thresholds of 17 (94.5%) otosclerotic
subjects were within ±5 db of their
first GSR thresholds. One subject had
a retest GSR threshold 10 db higher
than his first GSR threshold. A chi-
square analysis of the data for the two
groups indicated that there was no sig-
nificant difference between the normal-
hearing and otosclerotic subjects in the
threshold changes that took place from
one GSR test to another (chi square =
.73, df = 1). Since there was no sig-
nificant difference between the two
groups in the stability of GSR thresh-
olds, the results of the two groups were
combined. Analysis of the combined
results reveals that the retest GSR
thresholds of 39 subjects (95.1%) were
within ±5 db of their first GSR thresh-
olds. Only two subjects had a ±10 db
difference from one GSR test to an-
other. It appears, therefore, that condi-
tioned GSR audiometry has high
reliability as a method for measuring
pure-tone thresholds.


Table 2. Test-retest conditioned GSR and pre-GSR voluntary pure-tone (VPT) thresholds at
1000 cps for otosclerotic subjects (N = 18). All threshold values are in decibels re audi-
ometric zero.

Subject   First  Retest           First  Retest
    No.     GSR     GSR    Diff     VPT     VPT    Diff
      1      35      35       0      35      35       0
      2      65      65       0      65      65       0
      3      45      40      -5      45      45       0
      4      60      60       0      60      65      +5
      5      35      30      -5      30      30       0
      6      25      20      -5      25      20      -5
      7      50      45      -5      45      45       0
      8      30      30       0      25      25       0
      9      35      40      +5      35      35       0
     10      40      40       0      35      40      +5
     11      20      30     +10      30      30       0
     12      15      15       0      20      20       0
     13      55      60      +5      55      60      +5
     14      20      20       0      15      15       0
     15      25      30      +5      25      25       0
     16      45      40      -5      45      50      +5
     17      30      35      +5      30      30       0
     18      45      40      -5      45      45       0
   Mean   37.50   37.50    0.00   36.94   37.78    0.83
     SD   13.73   13.22    4.37   13.32   14.69    2.48

The reliability of GSR audiometry 
was also evaluated by comparing its 
reliability to the reliability of pure-tone 
audiometry. A generally accepted cri- 
terion of reliability for voluntary 
pure-tone audiometry is a test-retest 
difference no greater than ±5 db (3,
32). The GSR reliability data presented 
above indicate that 95.1% of the sub- 
jects met this criterion. A closer inspec- 
tion of Tables 1 and 2, however, reveals 
that the reliability of voluntary pure- 
tone thresholds is slightly higher than 


the reliability of GSR pure-tone thresh- 
olds. For example, Table 1 shows that 
all 23 normal-hearing subjects had retest 
voluntary thresholds at 1000 cps within
±5 db of their initial voluntary thresholds.
Of these, 16 subjects (69.6%) repeated
their thresholds exactly. Table 2
indicates that all otosclerotic subjects 
repeated their voluntary thresholds 
within ±5 db, with 13 subjects
(72.2%) repeating their thresholds 
exactly. All 41 subjects, therefore, had 
test-retest voluntary pure-tone thresholds
that agreed within ±5 db. As
would be expected, there was no signifi- 
cant difference between the two groups 
in the stability of their voluntary 
thresholds (chi square = .034, df = 1). 
It should be noted that only pre-GSR 
voluntary pure-tone thresholds were 


used in analyses involving voluntary 
thresholds. 


Test-retest GSR and voluntary pure- 
tone thresholds were compared on the 
basis of change (±5 db and ±10 db)
versus no change (0 db). For the otosclerotic
subjects, there was no significant
difference (chi square = 2.5, df =
1) between the stability of GSR thresh- 
olds and the stability of voluntary 
pure-tone thresholds. For normal-hear- 
ing subjects and for all subjects com- 
bined, voluntary pure-tone thresholds 
were significantly more stable than 
GSR thresholds (chi square = 4.08 for 
normals and 7.68 for all subjects com- 
bined; df = 1 for both analyses).* It 
appears, then, that voluntary pure-tone 
thresholds are somewhat more reliable 
than GSR thresholds. This conclusion 
appears to be supported by the smaller 


*The three chi-square values for voluntary-
GSR reliability comparisons were corrected
for continuity.










Table 3. Differences between voluntary pure-tone
(VPT) thresholds and GSR thresholds
at 1000 cps for normal-hearing subjects,
otosclerotic subjects, and all subjects combined.
Each test or retest VPT threshold is
compared to its corresponding test or retest
GSR threshold.

Difference between
VPT and GSR            N     Per Cent

Normals
   0 db               17       37.0
  ±5 db               27       58.7
  ±10 db               2        4.3
Otosclerotics
   0 db               19       52.8
  ±5 db               15       41.7
  ±10 db               2        5.5
Combined
   0 db               36       43.9
  ±5 db               42       51.2
  ±10 db               4        4.9








standard deviations for the difference 
scores for voluntary pure-tone audiom- 
etry (see Tables 1 and 2). The higher 
reliability of voluntary pure-tone audi- 
ometry, however, is a product of the 
criterion used to define threshold 
change. The chi-square results de- 
scribed above were based on a change 
versus no-change dichotomy, with a 
±5 db difference between test-retest
thresholds interpreted as a change. It 
has been previously noted, however, 
that a ±5 db change is an accepted
criterion of reliability for voluntary 
pure-tone audiometry. If the change 
versus no-change dichotomy had been 
interpreted as ±10 db for change and
±5 db (and 0 db) for no change, there
obviously would be no significant dif- 
ference between the reliability of GSR 
audiometry and the reliability of volun- 
tary audiometry. 
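
The corrected chi-square of 2.5 reported for the otosclerotic group can be checked against the difference scores in Table 2. The sketch below assumes that the comparison was a paired (McNemar-type) test with a continuity correction. That is an inference, not something the article states, although under it the Table 2 data do give 2.5. The function and variable names are illustrative only.

```python
# Difference scores (retest minus first test, in db) for the 18
# otosclerotic subjects, transcribed from Table 2.
gsr_diff = [0, 0, -5, 0, -5, -5, -5, 0, 5, 0, 10, 0, 5, 0, 5, -5, 5, -5]
vpt_diff = [0, 0, 0, 5, 0, -5, 0, 0, 0, 5, 0, 0, 5, 0, 0, 5, 0, 0]


def mcnemar_corrected(first, second):
    """Continuity-corrected McNemar chi-square (df = 1), change vs. no change.

    Assumptions (not stated in the article): 'change' means any nonzero
    test-retest difference, and the test is paired because every subject
    contributes both a GSR and a voluntary difference score.
    """
    # Only discordant subjects count: changed on one measure, not the other.
    only_first = sum(1 for a, b in zip(first, second) if a != 0 and b == 0)
    only_second = sum(1 for a, b in zip(first, second) if a == 0 and b != 0)
    discordant = only_first + only_second
    if discordant == 0:
        return 0.0
    return (abs(only_first - only_second) - 1) ** 2 / discordant


chi_square = mcnemar_corrected(gsr_diff, vpt_diff)
print(chi_square)  # 2.5
```

With df = 1 the .05 critical value of chi-square is 3.84, so a value of 2.5 falls short of significance, in line with the statement in the text.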


Validity of Conditioned GSR Audi- 
ometry. The data obtained during the 


reliability study were also used to eval- 
uate the validity of conditioned GSR 
audiometry. This was accomplished by 
comparing each subject’s initial GSR 
threshold to his initial voluntary thresh- 
old and each subject’s retest GSR 
threshold to his retest voluntary thresh- 
old. There were 46 such comparisons 
(23 initial test comparisons and 23 
retest comparisons) for the normal- 
hearing subjects and 36 comparisons 
(18 initial test comparisons and 18 retest 
comparisons) for the otosclerotic sub- 
jects. A total of 82 threshold compari- 
sons were made for the two groups 
combined. As indicated in Table 3, 
normal-hearing subjects had GSR 
thresholds that were within ±5 db of
the voluntary pure-tone thresholds in 
44 of 46 (95.7%) threshold compari- 
sons. For the otosclerotic subjects GSR 
thresholds were within ±5 db of the
voluntary thresholds in 34 of 36 
(94.5%) threshold comparisons. A chi-
square analysis indicated that there was 
no significant difference between the 
two groups in the relationship between 
GSR thresholds and voluntary pure- 
tone thresholds (chi square = .004, 
df = 1). For the combined groups, 
then, GSR thresholds were within ±5
db of the voluntary pure-tone thresh- 
olds in 78 of 82 (95.1%) threshold 
comparisons. These results are in good 
agreement with the validity data of 
Doerfler and McClure (12) and of 
Burk (7). 
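
The agreement percentages quoted in this paragraph follow directly from the counts in Table 3. The short sketch below recomputes them, with illustrative names. Conventional rounding of 34/36 gives 94.4, whereas the percentages printed in Table 3 for the otosclerotic group (52.8 and 41.7) sum to 94.5, so the last digit depends on the rounding convention.

```python
# Counts of VPT-GSR threshold differences from Table 3,
# in the order (0 db, ±5 db, ±10 db).
table3 = {
    "normals": (17, 27, 2),
    "otosclerotics": (19, 15, 2),
}


def percent_within_5_db(counts):
    """Percentage of comparisons in which GSR and VPT agree within ±5 db."""
    exact, off_by_5, off_by_10 = counts
    total = exact + off_by_5 + off_by_10
    return round(100 * (exact + off_by_5) / total, 1)


# Column-wise sum of the two groups gives the combined row of Table 3.
combined = tuple(sum(column) for column in zip(*table3.values()))

for name, counts in list(table3.items()) + [("combined", combined)]:
    print(name, sum(counts), percent_within_5_db(counts))
# normals 46 95.7
# otosclerotics 36 94.4
# combined 82 95.1
```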


Discussion 

The data of this study do not support 
Nober’s (26) contention that ‘... even 
with recent standardization of apparatus 
and technique the GSR [audiometry] 
is not completely reliable and is best 







used when interpreted as an approxi- 
mation of the subject’s auditory thresh- 
old rather than the absolute threshold.’ 
On the contrary, the data suggest that, 
if well-controlled methodology is em- 
ployed, GSR audiometry is reliable and 
provides valid estimates of a person’s 
pure-tone thresholds. Data of the pres- 
ent study, however, may be safely 
generalized only to clinical and research 
applications of GSR audiometry with 
adults when systematic methods are 
used and when these methods are based 
on pertinent research findings. It can- 
not be emphasized too strongly that 
GSR audiometry has been demonstrated 
to be valid and reliable only for people 
who are good candidates for the test. 
Approximately 80% of the authors’ 
adult male clinic patients have been 
good candidates for GSR audiometry; 
that is, they have met a GSR condition- 
ing criterion prior to threshold sam- 
pling. 

The validity and reliability of condi- 
tioned GSR audiometry with children 
have not yet been subjected to experi- 
mental research. The need for research 
on GSR audiometry with children was 
emphasized recently by Grings, Lowell, 
and Rushford (18). It is difficult to 
estimate how well the reliability and 
validity data for adults apply to chil- 
dren, but it seems reasonable to assume 
that the data would apply to children 
mature enough to remain still and alert 
during GSR audiometry. Some infants 
and young children, particularly those 
who are active, may be poor subjects 


for GSR audiometry because they pro- 


duce an excessive number of random 
GSRs. This hypothesis seems to be sup- 
ported by data published recently by 

Grings, Lowell, and Honnard (17). 


Strauss’s (30) investigation of the use 
of mephenesin carbamate (a skeletal 
relaxant) during GSR and conventional 
audiometry could be replicated profit- 
ably with children. Strauss found that 
mephenesin carbamate had no side 
effects and did not affect GSR or vol- 
untary pure-tone thresholds. If me- 
phenesin carbamate, or a similar drug, 
could be used with children a method 
would be available for reducing motor 
activity in children during GSR audi- 
ometry. Conditioned GSR audiometry 
with children demands constant atten- 
tion to the question of validity. Fre- 
quent control presentations and a 
threshold criterion based on the relation 
between responses to tones and controls 
probably should be included in all GSR 
audiometry, but the need is even more 
critical with children. 

It is unfortunate that the term ‘GSR 
audiometry’ has come to connote a 
specific test that is used in a standard- 
ized manner by audiologists (see Nober 
quote above). At this time it appears 


that neither GSR instrumentation (5,


19, 17, 28) nor methodology (5, 14, 
19, 23, 29) approach standardization. 
Until such standardization occurs it 
will be necessary for audiologists and 
other professional workers to evaluate 
the results of GSR audiometry in rela- 
tion to the particular GSR instrumenta- 
tion and methodology employed. 


Summary 


The reliability of conditioned GSR 
pure-tone audiometry was evaluated by 
a test-retest technique. Subjects were 
41 adult males, 23 with normal hearing 
and 18 with hearing loss caused by 
otosclerosis. They received two GSR 
tests spaced approximately one month 
apart. Voluntary pure-tone thresholds 













were obtained immediately before and 
immediately after each GSR test. All 
tests were conducted at 1000 cps. A 
systematic conditioning and threshold 
sampling procedure was utilized during 
GSR audiometry. This procedure in- 
volved specific response and threshold 
criteria. 

The results suggest that conditioned 
GSR audiometry is reliable when con- 
ducted under well controlled condi- 
tions with suitable candidates. A total 
of 39 subjects (95.1%) had retest GSR 
pure-tone thresholds within ±5 db of
their first GSR thresholds. Only two 
subjects (4.9%) had a ±10 db difference
from one GSR test to another.
These results agree well with the ±5
db test-retest difference accepted as 
normal variability for voluntary pure- 
tone audiometry. Data were also ob- 
tained confirming previous research 
findings on the high validity of GSR 
audiometry. Caution is suggested in 
generalizing from the results obtained 
in this study to the results of GSR 
audiometry that utilizes other pro- 
cedures or other subject populations. 


Acknowledgment 

Miss Gretchen A. Skalbeck, presently 
at the Audiology and Speech Pathology 
Clinic, VA Regional Office, Seattle, 
Washington, contributed to the plan- 
ning of this study and did several of 
the early tests of otosclerotic subjects. 


References 


1. American Standards Association, American Standard Specification
for Audiometers for General Diagnostic Purposes. Rep. No.
Z24.5-1951, Sect. 3.5.

2. Aronson, A. E., Hind, J. E., and Irwin, J. V., GSR auditory
threshold mechanisms: effect of tonal intensity on amplitude and
latency under two tone-shock intervals. J. Speech Hearing Res., 1,
1958, 211-219.

3. Barrett, L. S., Threshold relationships in simulated hearing
loss. Ph.D. dissertation, Stanford Univ., 1959.

4. Bitterman, M. E., Reed, P., and Krauskopf, J., The effect of the
duration of the unconditioned stimulus upon conditioning and
extinction. Amer. J. Psychol., 65, 1952, 256-262.

5. Bordley, J. E., and Hardy, W. G., A study in objective audiometry
with the use of a psychogalvanometric response. Ann. Oto. Rhino.
Laryng., 58, 1949, 751-760.

6. Bordley, J. E., Hardy, W. G., and Richter, C. P., Audiometry with
the use of galvanic skin-resistance response; a preliminary report.
Bull. Johns Hopk. Hosp., 82, 1948, 569.

7. Burk, K. W., Traditional and psychogalvanic skin response
audiometry. J. Speech Hearing Res., 1, 1958, 275-278.

8. Carhart, R., and Jerger, J. F., Preferred method for clinical
determination of pure-tone thresholds. J. Speech Hearing Dis., 24,
1959, 330-345.

9. Cook, S. W., and Harris, R. E., The verbal conditioning of the
galvanic skin reflex. J. exp. Psychol., 21, 1937, 202-210.

10. Davis, H., and Silverman, S. R., (Eds.) Hearing and Deafness.
(rev. ed.) New York: Rinehart and Winston, 1960.

11. Doerfler, L. G., and Kramer, Joan C., Unconditioned stimulus
strength and the galvanic skin response. J. Speech Hearing Res., 2,
1959, 184-192.

12. Doerfler, L. G., and McClure, Catherine T., The measurement of
hearing loss in adults by galvanic skin response. J. Speech Hearing
Dis., 19, 1954, 184-189.

13. Giolas, Marilyn H., Intensity generalization in clinical
galvanic skin response audiometry. Asha, 2, 1960, 325. (Abstract)

14. Goldstein, R., Effectiveness of conditioned electrodermal
responses (EDR) in measuring pure-tone thresholds in cases of
non-organic hearing loss. Laryngoscope, 66, 1956, 119-130.

15. Grant, D. A., Meyer, D. R., and Hake, H. W., Proportional
reinforcement and extinction of the conditioned GSR. J. gen.
Psychol., 42, 1950, 97-101.

16. Grant, D. A., and Schneider, Dorothy E., Intensity of the
conditioned stimulus and strength of conditioning: II. The
conditioned galvanic skin response to an auditory stimulus. J. exp.
Psychol., 39, 1949, 35-40.









17. Grings, W. W., Lowell, E. L., and Honnard, R. R., Electrodermal
responses of deaf children. J. Speech Hearing Res., 3, 1960,
120-129.

18. Grings, W. W., Lowell, E. L., and Rushford, Georgina M., Role of
conditioning in GSR audiometry with children. J. Speech Hearing
Dis., 24, 1959, 380-390.

19. Hind, J. E., Aronson, A. E., and Irwin, J. V., GSR auditory
threshold mechanisms: instrumentation, spontaneous response and
threshold definition. J. Speech Hearing Res., 1, 1958, 220-226.

20. Humphreys, L. G., Extinction of conditioned psychogalvanic
responses following two conditions of reinforcement. J. exp.
Psychol., 27, 1940, 71-75.

21. Jahoda, Marie, Deutsch, M., and Cook, S. W., Research Methods in
Social Relations, part one: Basic Processes. New York: Dryden, 1951.

22. Kimble, G. A., Conditioning as a function of the time between
conditioned and unconditioned stimuli. J. exp. Psychol., 37, 1947,
1-15.

23. Meritser, C. L., and Doerfler, L. G., The conditioned galvanic
skin response under two modes of reinforcement. J. Speech Hearing
Dis., 19, 1954, 350-359.

24. Moeller, G., The CS-UCS interval in GSR conditioning. J. exp.
Psychol., 48, 1954, 162-166.

25. Newby, H. A., Audiology; Principles and Practice. New York:
Appleton-Century-Crofts, 1958.

26. Nober, E. H., GSR magnitudes for different intensities of shock,
conditioned tone and extinction tone. J. Speech Hearing Res., 1,
1958, 316-324.

27. Spence, K. W., and Norris, Eugenia B., Eyelid conditioning as a
function of the inter-trial interval. J. exp. Psychol., 40, 1950,
716-720.

28. Stewart, K. C., A new instrument for detecting the galvanic skin
response. J. Speech Hearing Dis., 19, 1954, 169-173.

29. Stewart, K. C., Some basic considerations in applying the GSR
technique to the measurement of auditory sensitivity. J. Speech
Hearing Dis., 19, 1954, 174-183.

30. Strauss, R. B., Premedication in clinical audiometry. A.M.A.
Arch. Otolaryng., 67, 1958, 354-363.

31. White, C. T., and Schlosberg, H., Degree of conditioning of the
GSR as a function of the period of delay. J. exp. Psychol., 43,
1952, 357-362.

32. Witting, E. G., and Hughson, W., Inherent accuracy of a series
of repeated clinical audiograms. Laryngoscope, 50, 1940, 259-269.


RESEARCH NEWS NOTE

United World Films (1445 Park Ave., New York 29) is the distributor
of a new United States Information Service motion picture, ‘Growth
of a Language,’ which traces the development of English and gives
illustrations of the differences the ear detects in this language as
it is spoken in various countries. The film runs approximately 20
minutes and is listed among government films for schools and
colleges.













Identification of Stuttering 
during Relatively Fluent Speech 


RONALD W. WENDAHL 


JANE COLE 


Historically, investigators interested in
the specification of speech proceeded
to segment the continuous process into 
various units much in the same way as 
is now customarily done when speech 
is transcribed with phonetic symbols. 
While it was recognized that the proc- 
ess of segmentation was arbitrary and 
that there were obvious interactions be- 
tween combinations of phonemes, it 
was almost tacitly assumed that the 
interactions provided little more than 
redundant information. Consequently, 
research was directed toward the iden- 
tification of the so-called invariant as- 
pects of phonemes as determined at a 
‘single moment’ in time. 


While the segmentation of speech 
was obviously necessary for purposes 
of taxonomy and acoustic analysis, re- 
searchers became increasingly aware of 
the need to explore and to describe the 





Ronald W. Wendahl (Ph.D., University 
of Iowa, 1957) is Research Director, Houston 
Speech and Hearing Center, Clinical As- 
sistant Professor, Department of Otolaryngol- 
ogy, Baylor Medical School, and Clinical 
Associate Professor of Psychology and 
Speech, University of Houston. Jane Cole 
(MS, Vanderbilt University School of 
Medicine, 1960) is Speech Clinician, Hot 
Springs Rehabilitation Center. This work 
was supported in part by the Louise and
Gustavus Pfeiffer Foundation and the Bill 
Wilkerson Hearing and Speech Center. 




kinematic aspects of speech, and 
research efforts turned from the fre- 
quency domain toward the time do- 
main. As a result, it is now well 
accepted that one cannot specify speech 
in terms of invariant aspects of isolated 
phonemes. In the field of stuttering, 
investigators have recognized the desirability
of describing more about the
speech behavior of the stutterer than 
the theoretical ‘moment of stuttering.’ 
To look only at the ‘block,’ at an in- 
stant in time, is analogous to the seg- 
mentation of speech into phonemes and 
may not reveal the existence of inter- 
actions between relatively fluent and 
disfluent speech.


As long as the speech pathologist 
proceeds on the philosophical assump- 
tion that stuttering can be analyzed by 
focusing attention on the ‘block,’ 
therapy will be directed toward the 

goal of helping the stutterer learn to
control or handle the ‘block.’ The in- 
dividual stutterer can be taught to 
speak in a way which will allow him 
to control or vary overt stuttering but, 

as Williams (1) wrote, ‘. . . to him
his stuttering is a constant, an entity, 
a something that still remains. There- 
fore, even if he reduced to a considera- 
ble extent the overt activity of tensing, 
holding of breath, etc., “it” is still there, 




and he feels he must control it at all 
times or it is liable to appear again.’ 

The philosophy which directs atten- 
tion to the total process of speaking 
implies that the stutterer is doing 
‘things’ which interfere with his for- 
ward progress of speech at times other 
than during the ‘actual block.’ Williams 
(1) has stated that the stutterer does 
these ‘things’ two or three words before 
he reaches a point at which it is gen- 
erally considered that he is ‘beginning’ 
to stutter. 

If it is true that a stutterer does 
things which interfere with his speech 
prior to the ‘moment of stuttering,’ 
there should be techniques by which 
one can measure and quantify them. 
Further, should such sufficient cues 
exist, one may have cause to reconsider 
the meaningfulness of talking in terms 
of ‘moments of stuttering.’ 

This study represents an attempt to 
determine whether there are sufficient 
cues in the nonstuttered speech imme- 
diately preceding and succeeding the 
‘moment of stuttering’ to differentiate 
stutterers from nonstutterers. The pur- 
pose of this study was, therefore, (a) 
to determine whether a selected panel 
of judges would be able to distinguish
stutterers from nonstutterers when pre- 
sented with tape recorded speech sam- 
ples from which the overt behavioral 
units ordinarily considered to represent 
‘moments of stuttering’ had been ex- 
perimentally deleted, and (b), if this 
proved possible, to investigate the role 
played by rate, rhythm, and force as 
determinants of judged stuttering be- 
havior. 


Procedure 


Subjects. Eight adult male stutterers 
were matched with eight nonstutterers 


for age (plus or minus one year) and 
for reading proficiency. 


Judging. Judges for the study were 
86 undergraduate students in psychol- 
ogy, untrained with respect to evalua- 
tive and therapeutic procedures with 
stutterers. For purposes of judging, the 
study was divided into four parts and 
each judge was assigned to serve in 
one part and one part only. The num- 
ber of judges and the judgments asked 
in each part were as follows: Part 1, 
30, stuttering; Part 2, 18, rate; Part 3, 
19, force or strain; Part 4, 19, rhythm. 
Each group of judges listened independ- 
ently to the same speech samples played 
through a loudspeaker. 


Tape Recording. Each of the 16 sub- 
jects read a passage graded for the third 
year reading level, first silently and 
then orally. The oral reading was tape 
recorded. 


Tape Editing. Four sentences for 
each stutterer were selected from the 
reading passage. Each of the sentences 
was edited in the following manner: all 
instances of utterances containing 
sound, syllable, phrase, or word repeti- 
tions, prolongations, interjections, or 
unusual stress patterns were cut from 
the tape; in place of each such dis- 
fluency a two-second silent interval of 
blank tape was inserted. The tapes for 
each nonstutterer, previously matched 
with those for each stutterer, were 
similarly edited so that for each matched 
pair of speakers the same words were 
deleted and the same duration of silence 
inserted. In addition, any reading error, 
revision, mispronunciation, or similar 
disfluency was deleted from both sets 
of sentences for each matched pair. 

An attempt was made to insure that 
each sample contained at least eight 













words. Within the confines of this 
study it was not possible to hold rigidly 
to these criteria in 16 of the 64 sen- 
tences because of the location of the 
disfluencies. Therefore, the range in 
number of words in the samples was 
from five to 22 with a mean length of 
12.5. A deviation from the requirement 
that all ‘moments of stuttering’ be pre- 
ceded and followed by more or less 
fluent speech also was necessary. In 
one case the disfluency and, therefore, 
the silent interval, occurred on the last 
word. In five cases two ‘moments of 
stuttering’ were separated by only one 
word and the separating word was 
eliminated along with the disfluencies 
because it was felt that so short a 
speech segment might be misunderstood
and therefore misinterpreted.


The 64 sentences, four for each stut- 
terer and four identical sentences for 
each matched nonstutterer, were tran- 
scribed on a master tape so that, for 
example, where subject one and subject 
two were paired, sentence A of subject 
one always preceded sentence A of 
subject two; sentence B of subject one 
always preceded sentence B of subject 
two; and so on, until the four samples 
for each pair of subjects had been 
ordered. Although within any given 
pair of subjects the same speaker’s re- 
sponses always appeared first, the order 
of stutterer and nonstutterer was ran- 
domly varied between pairs of subjects. 


Presentation of Samples to Judges. 
Each of the four groups of judges lis- 
tened independently to the same speech 
samples played through a loudspeaker. 
Instructions for all four groups began 
with these two paragraphs: 


You will hear a series of speech sam- 
ples from the recorded readings of 16 


speakers. The speakers are paired so that 
speaker one and speaker two read the 
same words, speaker three and speaker 
four the same words, and so on. 


For experimental purposes certain words 
have been deleted and a two-second in- 
terval of silence has been inserted. 


Part 1. Additional instructions for the 
judges in Part 1 (stuttering, nonstutter- 
ing) were as follows: 


You are to make certain judgments con- 
cerning stuttering. In this test, either 
speaker in a pair, neither speaker in a 
pair, or both speakers in a pair may be 
classified as a stutterer. Mark S for each 
sample you consider to have been spoken 
by a stutterer and N for each sample you 
consider to have been spoken by a non- 
stutterer. 


Part 2. Additional instructions for the 
judges in Part 2 (rate of speaking) were 
as follows: 


You are asked to make judgments about 
the rate of speaking used by the indi- 
viduals you will hear. Place a check 
mark on the blank according to which of 
each pair of speakers is using a more 
normal speaking rate. 


Part 3. Additional instructions for the 
judges in Part 3 (force or strain) were 
as follows: 


You are asked to make certain judg- 
ments concerning the amount of force 
(or strain) in the voices of each of the 
speakers. Place a check mark on the
blank according to which of the pair of
speakers you feel is using more force or 
strain while speaking. 


Part 4. Additional instructions for the 
judges in Part 4 (rhythm) were as fol- 
lows: 


You are asked to make certain judg- 
ments concerning the rhythm patterns of 
each of the speakers. Place a check mark 
in the appropriate blank of your answer 
sheet for the one of each pair of speak- 
ers that is using a more normal rhythm 
pattern. 










Figure 1. Total number of speech samples
judged as stuttered by eight stutterers (solid
line) and a matched group of eight nonstutterers
(broken line). [Plot not reproduced. Vertical
axis: judged frequency of stuttering; horizontal
axis: paired subjects.]


Results 
From the cumulative number of times 


that stuttering and nonstuttering sub- 
jects were judged to be stutterers, it is 


Figure 2. Total number of stutterers (solid
line) and nonstutterers (broken line) judged
to have a poor rate of speaking. [Plot not
reproduced. Vertical axis: judged instances of
poor rate of speaking; horizontal axis: paired
subjects 1 to 8.]


apparent (Figure 1) that the speech of 
the stutterers was considered different 
from that of the nonstutterers. A t test 
for independent means established that 
the stutterers were significantly differen- 
tiated from the nonstutterers (t = 4.17; 
df = 14; t.01 = 2.98).

Analysis of the pooled data by t tests
for independent measures indicated the 


Figure 3. Total number of stutterers (solid
line) and nonstutterers (broken line) judged
to be using excessive force (strain) in speaking.
[Plot not reproduced. Vertical axis: instances
of excessive force; horizontal axis: paired
subjects 1 to 8.]


stutterers exhibited significantly poorer 
rates of speaking (t = 2.44; df = 14; 
t.05 = 2.14); spoke with greater force
or strain (t = 6.74); and were judged
to use less rhythmical speech (t =
4.22) than the nonstutterers (Figures 


2, 3, and 4). 
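
The t tests reported here compare two independent groups of eight speakers, which is where df = 14 comes from (8 + 8 - 2). The sketch below shows a pooled-variance t statistic of that kind. The article does not tabulate the per-speaker totals plotted in the figures, so the numbers used are made up for illustration and are not the study's data; the function name is likewise hypothetical.

```python
import math


def pooled_t(sample_a, sample_b):
    """Pooled-variance t statistic for two independent samples.

    Returns (t, df), where df = n_a + n_b - 2.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    # Sums of squared deviations about each group mean.
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = n_a + n_b - 2
    pooled_variance = (ss_a + ss_b) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical totals for eight stutterers and eight matched controls.
# These are NOT the values from the study.
stutterers = [60, 72, 55, 80, 66, 90, 48, 75]
nonstutterers = [30, 25, 41, 22, 70, 65, 28, 35]

t_value, df = pooled_t(stutterers, nonstutterers)
print(df)  # 14
# Critical values quoted in the article for df = 14: 2.14 (.05), 2.98 (.01).
print(abs(t_value) > 2.14, abs(t_value) > 2.98)
```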


From visual inspection of the data it 
is apparent that the stutterers tended to 
maintain their relative ranking on each 
of the four parts of the study while 
individuals in the control group evi- 













Figure 4. Total number of stutterers (solid
line) and nonstutterers (broken line) judged
to have poor rhythm in speaking. [Plot not
reproduced. Vertical axis: judged instances of
poor rhythm; horizontal axis: paired subjects
1 to 8.]


denced large variability between meas- 
ures. Analysis by the Kendall coefficient 
of concordance showed a W of .86 for 
the stutterers and .45 for the normal 
speakers. 
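
The Kendall coefficient of concordance measures how consistently a set of rankings orders the same individuals: W = 12S / [m²(n³ - n)], where m is the number of rankings, n the number of individuals ranked, and S the sum of squared deviations of the rank totals from their mean. The sketch below assumes untied ranks and uses hypothetical rankings, since the article does not list the ranks behind the reported values of .86 and .45. It illustrates only the two extremes of the statistic.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for untied ranks.

    rankings: a list of m rankings, each a list giving ranks 1..n to the
    same n individuals, in the same order.
    """
    m = len(rankings)
    n = len(rankings[0])
    # Total rank received by each individual across the m rankings.
    rank_totals = [sum(ranking[i] for ranking in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((total - mean_total) ** 2 for total in rank_totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical case 1: eight speakers keep the same rank on all four parts
# of the study (stuttering, rate, force, rhythm), so agreement is perfect.
identical = [list(range(1, 9)) for _ in range(4)]
print(kendalls_w(identical))  # 1.0

# Hypothetical case 2: two rankings in exactly opposite order cancel out.
opposed = [list(range(1, 9)), list(range(8, 0, -1))]
print(kendalls_w(opposed))  # 0.0
```

On this scale the reported W of .86 for the stutterers lies close to complete agreement among the four sets of judgments, while .45 for the normal speakers indicates much less consistent ranking.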


Discussion 


The results of this study tend to sup- 
port the hypothesis that the stutterer 
is not speaking normally up to the 
point in time at which the stuttering 
behavior is presumed to begin. They 
indicate, further, at least some of the 
ways in which he differs from normal 
speakers. It is not suggested that these 
are the only ways in which he differs. 
Instead, the writers believe that a more 
important variable may be operating 
than was tested in this study. This hy- 
pothesis is suggested from the data of 
nonstuttering control subjects five and 
six. Speaker five was judged to be a 
stutterer more often than five of the 
stutterers and certainly more often than 


any of the controls, yet the data from 
judgments on his rate, force, and 
rhythm do not show any extremely 
high deviations. Control subject six re- 
ceived what could be considered good 
ratings for rate, force, and rhythm, 
and yet the number of times he was 
labeled a stutterer was exceeded only by 
the number of these judgments for 
normal control subject number five. 


From the coefficients of concordance 
and from visual inspection of the data, 
it is suggested that the normal speakers 
have wide latitude of disfluencies before 
they will be labeled as stutterers. 


Williams* has stated, ‘The extent to
which the stutterer is able to do more 
things that normal speakers do, is the 
extent to which he will come more 
and more to speak like a normal speaker. 
First, however, he must become thor- 
oughly familiar with what a normal 
speaker does when he talks.’ On the 
basis of the Williams theory and the 
findings of this study, which are in 
high agreement, it is suggested that if 
a speech therapy program is to be ade- 
quate it must be directed toward the 
total process of speech. It will be in- 
adequate if it is limited to a considera- 
tion of only the ‘stuttering block.’ 


Summary 


The purpose of the present study was 
(a) to determine if naive listeners could 
distinguish stutterers from nonstutterers 
by listening to their tape-recorded 
speech from which obvious disfluencies 
had been removed and (b) to deter- 
mine if, from the edited tape recording, 
the listeners would judge the stutterers’ 


*D. E. Williams, personal communication. 




speech as having poorer rate, showing 
more force or strain, and poorer 
rhythm patterns than the speech of 
nonstutterers. The results indicate that 
(a) stutterers are easily differentiated 
as a group even on the basis of such 
edited recordings and (b) stutterers 
have a poorer rate of speech, use more 
force or strain, and have less rhyth- 
mical speech patterns than nonstut- 
terers. It is suggested that a therapy 
program with stutterers must consider 
speech as a whole and that it is not 
sufficient to work toward the goal of 
modifying the ‘stuttering block.’ 


Acknowledgement 


Acknowledgement is hereby given to 
Dr. Dean E. Williams for his help in 
preparing the manuscript and for his 
suggestion of the theoretical frame of 
reference which guided the purpose of 
the study. Acknowledgement is also 
given to Mary Johnson, Linda Lyons, 
and Mary Talley for the pilot work on 
sections two, three, and four. 


Reference 
1. Williams, D. E., A point of view about
‘stuttering.’ J. Speech Hearing Dis., 22,
1957, 390-397.
















Intellectual Impairment in Children with Cleft Palates 


LEONARD D. GOODSTEIN 


The clearly obvious nature of the hand- 
icap in children with cleft lips and 
palates, particularly the difficulties of 
these children in oral communication, 
often leads to an initial impression that 
they are intellectually impaired. Indeed, 
a study of the early development of 
typical cleft palate children would pro- 
vide considerable support for such an 
impression. Their development is com- 
plicated by feeding problems, frequent 
infections, and the many difficulties as- 
sociated with the medical management 
of the cleft itself, all of which may in- 
volve long periods of hospitalization. 
Such hospitalization, coupled with the 
overprotective attitudes of the parents, 
associated difficulties in hearing, and 
other related problems, represent com- 
mon etiological factors in intellectual 
retardation. Further, the generally poor 
quality of the speech of these children 
would presumably lead to fewer re- 
wards for speaking with a subsequent 
retardation in language skills and in- 
telligence. 

While there is rather clear evidence 
that children with cleft palates have im- 
paired articulation skills (10), language 





Leonard D. Goodstein (Ph.D., Columbia 
University, 1952) is Associate Professor of 
Psychology and Director, University Coun- 
seling Service, University of Iowa. This in- 
vestigation was supported in part by Research 
Grant M-1158 from the National Institute of 
Mental Health, United States Public Health 
Service. 


Volume 4, No. 3 


287 


skills (9), and communication skills (6), 
the evidence on intellectual impairment 
is scanty and incomplete. The two pub- 
lished research studies known to this 
writer (3, 7) that were specifically con- 
cerned with the intelligence of persons 
with cleft palates are methodologically 
deficient in that both failed to include 
any matched control group, each in- 
stead comparing obtained results with 
published test norms. Billig (3) re- 
ported the Intelligence Quotients (IQs) 
of 60 children with cleft palates, aged 
two months to 17 years, ranged from 64 
to 130 with a mean of 94. Billig’s find- 
ings are difficult to interpret, however, 
because they are based upon eight dif- 
ferent measures of intelligence, each of 
which has somewhat different statistical 
characteristics. Munson and May (7) 
reported a mean IQ of 96 for 151 per- 
sons with cleft palates of unspecified 
ages, based upon two different intelligence
tests. Both of these studies provide data
which suggest that while the
distribution of IQs in persons with cleft 
palates is approximately normal, there 
is some skewness toward the lower end 
of the intelligence continuum. Thus ap- 
parently neither the degree nor the 
specific nature of intellectual impair- 
ment in children with cleft palates has 
been determined. 


The purpose of the present study is
to evaluate the degree and character- 
istics of intellectual impairment in a 


September 1961 








288 Journal of Speech and Hearing Research 


representative sample of children with 
cleft palates by comparing their per- 
formance on an individual, standardized 
intelligence test with that of a matched 
control group of physically normal 
children. It is hoped that the data pre- 
sented in this report will be useful both 
in a theoretical understanding of chil- 
dren with cleft palates and in practical 
planning of therapeutic and rehabilita- 
tion programs. 


Procedure 


As part of a large-scale, intensive in- 
vestigation of the factors related to the 
adjustment of children with cleft lips 
and palates, the voluntary participation 
of 175 Caucasian families, with children 
ranging in age from newborns to 16 
years and having a cleft lip or palate or 
both, was obtained. These experimental 
families were drawn from seven mid- 
western states, primarily Iowa and Illi- 
nois. All participating families agreed to 
have the mother accompany the child 
to the University of Iowa Medical 
Center for interview and study. Where 
appropriate, the expenses incurred in 
traveling to and from Iowa City were 
reimbursed. Under no circumstance was 
any charge made for the tests and other 
procedures involved. 


For comparative purposes, a control 
group of 175 families, matched to the 
experimental families on the basis of the 
child’s age, sex, and birth order as well 
as family size, rural-urban background, 
socioeconomic status, and religious af- 
filiation, was used. The voluntary co- 
operation of the control families was 
obtained in six Iowa and one Minnesota 
communities through the assistance of 
superintendents of schools, county 
nurses, ministers, and other public of- 


ficials who contributed the names of 
prospective participants. Following an 
initial contact by a member of the re- 
search team, the interviews and other 
procedures were completed in the local 
communities, using school, courthouse, 
and church facilities for the testing and 
interviewing. Each control family re- 
ceived $10, plus transportation expenses, 
for participating in the study. 

The Wechsler Intelligence Scale for 
Children (WISC) (11) was individual- 
ly administered by a trained examiner 
to all children in both the experimental 
and control groups who had reached 
their fifth birthday. WISC results were 
thus actually obtained from 105 experi- 
mental subjects (Ss) and 95 control Ss. 
Two of the total 175 experimental and 
three of the 175 total control Ss were 
untestable because of refusals, extreme 
fatigue, and other reasons. The remain- 
ing Ss were not tested because they had 
not reached their fifth birthday, the 
lower limit of the WISC standardization 
data. Prior to the initiation of the pres- 
ent study, it was decided not to test 
these younger children because of the 
difficulties in obtaining reliable IQs 
from such young children (2) as well 
as the difficulties in finding a suitable 
test. 

The WISC is a carefully standardized, 
individually administered intelligence 
test composed of 12 parts or subtests 
that yield three intelligence quotients 
(IQ): a Verbal IQ, a Performance IQ, 
and a Full Scale IQ. The Verbal IQ is 
based upon the child’s performance on
the six language subtests, such as defining
words and solving arithmetic problems,
while the Performance IQ is based
upon his performance on six relatively 
nonverbal subtests, such as arranging 
pictures to tell a story and writing num- 











Goodstein: Intellectual Impairment in Cleft Group 289 


bers in a substitution task; the Full Scale 
IQ is essentially an average of the Per- 
formance and Verbal IQs. Each of the 
three WISC IQs has been standardized 
with a mean of 100 and a standard devi- 
ation of 15 points. 


In her summary of the significant 
factors related to the intelligence of 
children, Anastasi (1, pp. 274-276) 
noted that substantial positive correla- 
tions (approximately .50) have been 
consistently reported between parent 
and child intelligence test scores. A 
measure of parental intelligence was in- 
cluded in the present study so that the 
intelligence of the experimental and 
control children could be compared and 
evaluated with the intelligence of the 
parents held statistically constant. The 
First Civilian Edition of the Army Gen- 
eral Classification Test (AGCT) (5) 
was thus administered as an additional 
control measure to all parents of chil- 
dren who had taken the WISC. The 
AGCT, a 150 item, virtually self-admin- 
istering, paper-and-pencil test of adult 
intelligence, yields a single standard 
score based upon a mean of 100 and a 
standard deviation of 20. A total of 103 


of the 105 experimental mothers and 94 
of the 95 control mothers actually com- 
pleted the test while the number of 
fathers was 91 and 85, respectively. The 
remainder were either unavailable or re- 
fused to cooperate, or there was insuf- 
ficient time to complete the test. 

To further evaluate the behavior of 
these children, in particular their social 
competence, each mother was personal- 
ly interviewed by a trained clinician 
who then scored the Vineland Social 
Maturity Scale (VSMS) (4) according 
to the information gathered in the inter- 
view. In this procedure the examiner 
obtains as much detail as possible about 
the child’s ‘actual and habitual perform- 
ance’ in several areas of social compe- 
tency, such as self-help in eating, 
communication, etc., and then scores 
this reported performance against an 
objective, standardized schedule of nor- 
mal development. The VSMS yields a 
single Social Quotient (SQ) which is 
obtained in the same manner as an age- 
scale IQ; for children of the age levels 
involved in this study the expected 
mean SQ is 100 with a standard devia- 
tion of approximately 12 points (4, pp. 
364-381). 


Table 1. Means, adjusted means, and standard deviations of the several test measures for the
experimental and control groups together with the significance level of the difference between
the means.

                        Total Experimental Group         Control Group                                Significance
Measure                 M       AdjM    SD      N        M            AdjM         SD           N     p
WISC Verbal IQ          91.5    92.5    14.4    105      103.0        102.7        12.7         95    .01
WISC Performance IQ     97.9    98.6    14.8    105      104.6        104.1        12.1         95    .01
WISC Full Scale IQ      94.0    94.9    14.5    105      [illegible]  [illegible]  [illegible]  95    .01
Vineland SQ
  Ages 5 to 16          99.8            11.8    105      103.5                     10.3         95    .02
  Ages 0 to 5           96.7            17.1    34       108.5                     15.6         79    .01
Mothers’ AGCT           107.0           16.4    103      110.4                     14.0         94    ns
Fathers’ AGCT           109.3           17.7    91       112.9                     16.8         85    ns
















Table 2. Distribution of Full Scale WISC IQs for the experimental and control groups
together with the distribution to be expected in a random sample.

IQ              Intellectual        Expected    Control     Experimental
                Classification      Per Cent    Per Cent    Per Cent

130 and Above   Very Superior        2.2         0.0         0.0
120 to 129      Superior             6.7         8.4         2.9
110 to 119      Bright Normal       16.1        28.4        11.4
90 to 109       Average             50.0        50.5        45.7
80 to 89        Dull Normal         16.1         9.5        25.7
70 to 79        Borderline           6.7         2.1         8.6
69 and Below    Mental Defective     2.2         1.1         5.7








Results 


The means and standard deviations 
on all test measures for the experimental 
and control groups are presented in 
Table 1 together with the significance 
level of the differences between the 
group means as evaluated by t tests.
(Also given in Table 1 are adjusted 
means resulting from analyses of covari- 
ance, to be discussed later.) 


For all three WISC IQs the control 
group means were reliably higher than 
those of the experimental group (ps < 
.01) indicating a statistically significant 
intellectual impairment in the cleft palate
group of at least 6 IQ points.
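The group comparison just described can be approximated from the summary statistics published in Table 1. The Python sketch below is illustrative only. It assumes a pooled-variance t test and standard deviations computed with an n - 1 denominator, neither of which the report specifies, and the function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups from summary statistics.

    Assumes equal population variances (pooled form) and that each SD
    was computed with the n - 1 denominator.
    """
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, n1 + n2 - 2


# WISC Verbal IQ row of Table 1: control group versus experimental group.
t_value, df = pooled_t(103.0, 12.7, 95, 91.5, 14.4, 105)
print(f"t = {t_value:.2f}, df = {df}")
```

The resulting t is then compared with the tabled critical value for the stated degrees of freedom; the same call can be repeated for any row of Table 1 whose means, SDs, and Ns are legible.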


The extent of this impairment may 
also be seen in Table 2 where the distri- 
butions of Full Scale IQs in the two 
groups of children are presented to- 
gether with the percentages to be ex- 
pected at each level of intelligence in 
a random sample (1). The comparison 
of the two obtained distributions with 
each other and with the distribution of 
expected percentages by means of three 
2 X 7 chi-square tests indicated that the 
distribution of IQs for the experimental 
group was significantly (p < .01) dif- 


ferent from both that of the control 
group and that of the expected percent- 
ages; the latter two distributions, how- 
ever, were not significantly different 
from each other (p > .10). 
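The comparison of the two obtained distributions can be sketched in Python as follows. This is a hedged illustration, not the original computation: the counts are back-calculated from the percentages in Table 2 (an assumption), the function name `chi_square` is hypothetical, and the empty '130 and Above' class is dropped because its expected frequency is zero, which leaves five rather than six degrees of freedom.

```python
def chi_square(rows):
    """Pearson chi-square for an r x c table of counts (list of lists).

    Columns whose total is zero are dropped, because their expected
    frequencies are zero and they cannot contribute to the statistic.
    """
    col_totals = [sum(col) for col in zip(*rows)]
    keep = [j for j, total in enumerate(col_totals) if total > 0]
    rows = [[row[j] for j in keep] for row in rows]
    col_totals = [col_totals[j] for j in keep]
    row_totals = [sum(row) for row in rows]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df


# Counts back-calculated from the Table 2 percentages (an assumption);
# classes run from "130 and Above" down to "69 and Below".
experimental = [0, 3, 12, 48, 27, 9, 6]  # N = 105
control = [0, 8, 27, 48, 9, 2, 1]        # N = 95
stat, df = chi_square([experimental, control])
print(f"chi-square = {stat:.2f}, df = {df}")
```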


A further examination of Table 1 
indicates that the older control group 
children (those aged five to 16 for 
whom WISC IQs were available) had 
a significantly higher (p < .02) mean 
SQ on the Vineland Social Maturity 
Scale than the older experimental group
children. The younger control group 
children (those under five who had not 
taken the WISC) also had a significant- 
ly higher (p < .01) mean SQ on the 
VSMS than the younger experimental 
group children. 


Table 1 also indicates that, while both 
the mothers and fathers of the control 
group children had slightly higher mean 
AGCT scores than the mothers and
fathers of the experimental group children,
the obtained differences were not
statistically significant (ps > .10). 


In an effort to reduce the within- 
groups variability associated with pa- 
rental intelligence, the WISC data were 
further analyzed by means of analyses 
















[Figure 1. Axes: Frequency (in %) versus Discrepancy (in IQ points); curves labeled Experimental Group and Control Group; region labeled P>V.]


Figure 1. Distribution of Verbal-Performance 
IQ discrepancy scores in the experimental and 
control groups. 


of covariance, using the mean parental 
AGCT scores for each child as the con- 
trol variable. The adjusted mean Ver- 
bal, Performance, and Full Scale IQ 
which resulted from these three anal- 
yses of covariance are presented in 
Table 1, along with corresponding un- 
adjusted means discussed above. While 
the mean differences between the ex- 
perimental and control groups were 
slightly reduced in all three cases, the 
appropriate F values from these analyses 
of covariance indicated that these dif- 
ferences were still highly significant 
(all ps < .01). Thus even with the ef- 
fects of parental intelligence held 
constant, the control group was sig- 
nificantly more intelligent than the ex- 
perimental group. 
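The adjustment of group means by analysis of covariance can be sketched as follows. This is a minimal illustration under stated assumptions: each group mean is corrected by the pooled within-group regression slope times the distance of that group's covariate mean from the grand covariate mean. The data are invented for the example and are not taken from the study, and the function name `adjusted_means` is hypothetical.

```python
def adjusted_means(groups):
    """Covariance-adjusted outcome means for several groups.

    `groups` maps a label to a list of (covariate, outcome) pairs.
    Returns the pooled within-group slope and a dict of adjusted means.
    """
    n_total = sum(len(pairs) for pairs in groups.values())
    grand_x = sum(x for pairs in groups.values() for x, _ in pairs) / n_total
    sum_xy = 0.0
    sum_xx = 0.0
    means = {}
    for label, pairs in groups.items():
        mean_x = sum(x for x, _ in pairs) / len(pairs)
        mean_y = sum(y for _, y in pairs) / len(pairs)
        means[label] = (mean_x, mean_y)
        sum_xy += sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        sum_xx += sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sum_xy / sum_xx
    adjusted = {
        label: mean_y - slope * (mean_x - grand_x)
        for label, (mean_x, mean_y) in means.items()
    }
    return slope, adjusted


# Hypothetical (mean parental AGCT, child IQ) pairs, invented for illustration.
toy = {
    "experimental": [(100, 90), (110, 95), (120, 100)],
    "control": [(105, 100), (115, 105), (125, 110)],
}
slope, adjusted = adjusted_means(toy)
print(slope, adjusted)
```

In this toy case the group with the lower covariate mean has its outcome mean adjusted upward and the other group's mean is adjusted downward, so the difference between the groups shrinks, which is the same direction of change reported for the adjusted means in Table 1.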


Table 3. Means and standard deviations of the several test measures for the three subgroups
of experimental subjects.

                        Lip and Palate          Palate Only             Lip Only
Measure                 Mean    SD      N       Mean    SD      N       Mean    SD      N
WISC Verbal IQ          93.2    14.5    74      85.6    13.8    23      92.8    11.0    8
WISC Performance IQ     99.3    14.5    74      92.5    14.2    23      100.9   15.4    8
WISC Full Scale IQ      95.7    14.5    74      87.7    12.5    23      96.3    13.8    8
Vineland SQ             100.6   11.7    74      97.0    7       23      100.4   8.3     8
Mothers’ AGCT           106.1   17.1    73      106.7   13.9    22      116.7   13.4    8
Fathers’ AGCT           109.4   17.7    64      108.7   19.2    21      110.8   11.3    6


To further evaluate the nature of the
obtained differences in IQ between the
two groups, a ‘discrepancy score’ was
computed for each child, the numerical
difference between his Verbal IQ and
his Performance IQ. Since the Perform-
ance IQ was always subtracted from the
Verbal IQ, the sign for P > V is minus
and the sign for V > P is plus. The dis-
tribution of these discrepancy scores for
the experimental and control groups is
graphically presented in Figure 1.

It can be noted in Figure 1 that the
distribution of discrepancy scores in the
two groups was dissimilar, with the ex-
perimental group having had more and
higher negative discrepancy scores than
the control group. A statistical com-
parison of these two distributions of
discrepancy scores by means of the chi-
square test for a 2 x 12 table yielded
highly significant results (p < .01). The
statistical evaluation of such discrepan-
cy scores in the WISC standardization
sample leads to the expectation that
positive and negative scores are equally
frequent and that only about one-third
of such scores will be larger than 12 to
13 IQ points (8). These expectations
were not met in the experimental group
where 34% had large (at least 12 to 13
IQ points) negative discrepancy scores
and 9% had large positive discrepancy
















scores. In the control group, however, 
these percentages were 20 and 18, re- 
spectively, which was relatively close 
to expectation. Thus it would appear 
that the experimental children were rel- 
atively more retarded in verbal intelli- 
gence than in performance intelligence 
in comparison with either the matched 
control group or the original WISC 
standardization group.
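The discrepancy scoring described above can be sketched in Python. This is an illustrative sketch under assumptions: the report gives the threshold for a 'large' score only as 12 to 13 IQ points, so the default `cutoff` of 12 is a guess, the children listed are invented, and all function names are hypothetical.

```python
def discrepancy_score(verbal_iq, performance_iq):
    """Verbal IQ minus Performance IQ; a negative score means P > V."""
    return verbal_iq - performance_iq


def classify(score, cutoff=12):
    """Label a discrepancy score; `cutoff` is an assumed threshold."""
    if score <= -cutoff:
        return "large negative (P > V)"
    if score >= cutoff:
        return "large positive (V > P)"
    return "small"


def percent_large(scores, cutoff=12):
    """Percentages of large negative and large positive scores."""
    n = len(scores)
    negative = sum(1 for s in scores if s <= -cutoff)
    positive = sum(1 for s in scores if s >= cutoff)
    return 100 * negative / n, 100 * positive / n


# Hypothetical (Verbal IQ, Performance IQ) pairs, invented for illustration.
children = [(85, 100), (92, 95), (104, 90), (88, 101), (99, 97)]
scores = [discrepancy_score(v, p) for v, p in children]
print(scores, [classify(s) for s in scores], percent_large(scores))
```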

In an effort to evaluate the relation- 
ship of type of cleft to intellectual im- 
pairment, the experimental children for 
whom WISC scores were available were 
divided into three subgroups deter- 
mined by the type of the handicap: 
cleft lip and palate (N = 74), cleft 
palate only (N = 23), and cleft lip 
only (N = 8). The means and standard
deviations on the several test measures 
for these three subgroups of experi- 
mental Ss are presented in Table 3. 


With the Lip Only subgroup ex- 
cluded because of the small N, the dif- 
ferences between the means of the other 
two groups on each of the several test 
measures were evaluated by means of t 
tests. All three mean IQs of the Palate 
Only subgroup were significantly (all 
ps < .05) lower than the corresponding 
mean IQs of the Lip and Palate subgroup.
The differences on the VSMS
and the AGCT were not statistically 
significant (all ps > .10). Thus those 
children who had only cleft palates 
were statistically significantly more im- 
paired in their intellectual development 
than were the children with both cleft 
lips and palates. 


Discussion 


The results of the present study 
clearly indicate that children with cleft 
lips and palates are significantly im- 


paired in their intellectual development. 
While the three WISC mean IQs reflect 
a general impairment of six to 11 IQ 
points, the extent of this intellectual 
impairment is best seen in the distribu- 
tion of Full Scale IQs in Table 2. Of 
the children with cleft lips and palates, 
40% have Full Scale IQs of 89 or less 
as contrasted to only 12.7% in the 
matched control group. While 25% of 
a randomly selected sample can be ex- 
pected to have IQs of 89 or less, the 
careful matching involved in selecting 
the control group suggests that in this 
case the control group provides the 
more legitimate expectation. In either 
case the obtained results with the ex- 
perimental group indicate that a mean- 
ingfully larger group of children with 
cleft lips and palates has Dull Normal 
or lower intelligence. At the other end 
of the intelligence continuum similar 
marked impairment of the experimental 
children can be noted with more than 
twice as many of the control children 
having IQs in the Bright Normal and 
Superior classifications (36.8 versus 
14.3%). 


It is clearly evident from the data in 
Table 1 and especially in Figure 1 that 
the intellectual impairment of these 
children with cleft lips and palates is 
most substantial in the area of verbal 
intellectual skills. Such a result is com- 
patible with the common-sense expecta- 
tion that the gross impairment of the 
speech mechanism in persons with cleft 
palates would make the learning of 
verbal intellectual skills difficult and 
more poorly rewarded. Additional evi- 
dence, as yet unavailable, on the rela- 
tionship of the severity of the handicap 
to the severity of the intellectual im- 
pairment might provide considerable 
















support for this interpretation. It is important
to note, however, that the obtained
evidence on the intellectual
impairment of the experimental group 
has important practical implications in 
planning therapeutic and educational 
programs for these children. Many of 
these children in the lower three intel- 
lectual classification levels will have 
difficulty in assimilating ordinary edu- 
cational matter and can be expected to 
be at least somewhat retarded in their 
educational development. In making 
educational plans for such children on 
a longer range basis, collegiate training 
seems an unreasonable goal for many 
but certainly not all of the children 
inasmuch as Bright Normal intelligence 
is considered as minimal for such plan- 
ning. Because of the increased possi- 
bility that an individual child with a 
cleft may be intellectually impaired, 
individual intellectual assessment should 
be included in the planning stages of all 
total habilitative programs for such 
children. 

An inspection of the data in Table 1 
suggests that, while the experimental 
children are statistically significantly 
impaired in their social development, 
this impairment is apparently not as 
marked, especially in those children 
aged five and older, as the intellectual 
impairment. The mean SQ of the older 
experimental children is only 0.2 points 
lower than the expected SQ of 100 and 
is only 3.7 points lower than the mean 
of the control group. An inspection of 
the actual frequency distributions of 
the SQs for these two groups (these 
data not included in the interests of 
saving space) strongly supports this 
impression. Only two of the experi- 
mental group and one of the control 
group have SQs below 80. Since the 


differences in the younger children are 
larger than those for the older children, 
it may be argued that while the experi- 
mental children are younger they are 
more socially retarded, perhaps because 
of overprotection by anxious parents, 
but this retardation is reduced with in- 
crease in age. Confirmation of this hy- 
pothesis would require a follow-up 
study of the children in the experimen- 
tal group using the VSMS to evaluate 
these changes. 


It is somewhat surprising to note that 
those children who have cleft palates 
only were significantly more intellectu- 
ally impaired than those children who 
had both cleft lips and palates. This 
finding is partially explained by an anal- 
ysis of the clinical records of the two 
groups. Of the group of children with
cleft palates only, 12 of 23 (52%) had 
one or more other physical anomalies 
(heart disorders, hernias, etc.) reported 
by the examining physician while only 
13 of the 74 (16%) children with both 
cleft lips and palates had such multiple 
anomalies. Thus it would seem that the 
cleft palate only group was indeed the 
more physically impaired group as well 
as also being the more intellectually im- 
paired. The present data do not give 
any clue to the underlying reasons for 
the more frequent appearance of multi- 
ple anomalies in the children with cleft 
palates only and additional research is 
indicated. 




Summary 


In order to evaluate the degree and 
the characteristics of intellectual im- 
pairment in children with cleft lips and 
palates, the Wechsler Intelligence Scale 
for Children (WISC) was individually 
administered to a representative sample 










of 105 children, aged five to 16 years, 
all of whom had a cleft lip or palate or 
both, and also to a matched control 
group of 95 children. All three mean 
IQs (Verbal, Performance, and Full 
Scale) obtained by the experimental 
group of children with cleft lips and 
palates were significantly lower (all ps 
< .01) than the same mean IQs ob- 
tained by the control group. The extent 
of these differences in intellectual im- 
pairment and the primarily verbal na- 
ture of the deficits were described in 
some detail together with some of the 
implications of these findings for edu- 
cational and habilitative work with 
these children. 


Acknowledgment 


The author is indebted to Mrs. Bette 
Spriestersbach and Mrs. Jean Parker, 
who served as the principal psycho- 
logical examiners during the conduct of 
the study, and to Dr. Dee W. Norton, 
Dr. Gene R. Powers, and Robert F. 
Stanners for their assistance with vari- 
ous portions of the statistical analyses. 
The author is especially indebted to Dr.
D. C. Spriestersbach, Principal Investi- 
gator of Grant M-1158, for his continu- 
ing assistance and support throughout 
the course of this investigation. 


References 


1. ANASTASI, ANNE, Differential Psychology,
Individual and Group Differences in Behavior.
(3rd ed.) New York: Macmillan, 1958.

2. BAYLEY, NANCY, Consistency and variability
in the growth of intelligence from
birth to eighteen years. J. genet. Psychol.,
75, 1949, 165-196.

3. BILLIG, A. L., A psychological appraisal
of cleft palate patients. Proc. Pa. Acad.
Sci., 25, 1951, 29-32.

4. DOLL, E. A., The Measurement of Social
Competence; A Manual for the Vineland
Social Maturity Scale. Minneapolis: Educ.
Test Bur., Educ. Pub., 1953.

5. Examiner Manual for the Army General
Classification Test. (1st Civilian ed.)
Chicago: Science Res. Assoc., 1947.

6. MORRIS, H. L., Communication skills of
children with cleft lips and palates. Ph.D.
dissertation, Univ. Iowa, 1960.

7. MUNSON, S. E., and MAY, ANNA M., Are
cleft palate persons of sub-normal intelligence?
J. educ. Res., 48, 1955, 617-621.

8. SEASHORE, H. G., Differences between
verbal and performance IQs on the
Wechsler Intelligence Scale for Children.
J. consult. Psychol., 15, 1951, 62-67.

9. SPRIESTERSBACH, D. C., DARLEY, F. L., and
MORRIS, H. L., Language skills in children
with cleft palates. J. Speech Hearing Res.,
1, 1958, 279-285.

10. SPRIESTERSBACH, D. C., DARLEY, F. L., and
ROUSE, VERNA, Articulation of a group of
children with cleft lips and palates. J.
Speech Hearing Dis., 21, 1956, 436-445.

11. WECHSLER, D., Manual for the Wechsler
Intelligence Scale for Children. New
York: Psychol. Corp., 1949.








Letters to the Editor 


The Editorial Staff assumes no responsibility for the 
opinions expressed in Letters to the Editor. 


Comment on ‘Dimensions 
of Language Performance 
in Aphasia’ 


We welcome the opportunity to comment 
on the study by Jones and Wepman concern- 
ing dimensions of language performance in 
aphasia. Jones and Wepman* conclude that
their findings are ‘in sharp contrast’ to those 
previously reported by us (3). It appears to 
us that they have misinterpreted our findings. 
We, therefore, should like to clarify our 
hypothesis and to suggest that the two studies, 
theirs and ours, complement each other to 
provide a comprehensive description of 
aphasic performance. 


We concluded our data did not support 
classical descriptions of aphasia which are 
based on a sensory and motor dichotomy, or 
isolation of ‘pure’ defects, such as alexia,
agraphia, and acalculia. Our study pointed 
out that such logical categories were not 
empirically determined, and demonstrated 
that analysis of data, obtained from perform- 
ance of a large sample of aphasic subjects on 
a substantial battery of tests, was compatible 
with the hypothesis of a single dimension of 
deficit crossing all language modalities. We 
welcome the data-oriented approach of Jones 
and Wepman, and are gratified that their 
extensive research, utilizing a different test 
battery, an even larger aphasic population, 
and a different statistical tool, similarly re- 
jects historical a priori classifications. We 
trust the ghosts of classical analyses will be 
laid for all time. The congruence of the two 
studies in this respect is extremely convincing. 


*See pages 220-232 this issue: Dimensions 
of Language Performance in Aphasia, Lyle 
V. Jones and Joseph M. Wepman. 




The Jones-Wepman discussion, however, 
has introduced a new issue: a general factor 
of language deficit versus multiple factors. It 
is implied that we support a general factor 
of deficit and determination of a single 
quantitative subject-score, while Jones and 
Wepman recognize more dimensions and 
argue for multiple scores and more detailed 
analysis. This is reminiscent of the older 
controversy concerning the structure of in- 
telligence, with advocates of a ‘general’ factor 
on one side, and advocates of ‘group’ factors 
on the other. With the fine acuity of hind- 
sight, it is now possible to see that both sides 
were ‘right.’ Virtually all factor analysts
now recognize group factors, and almost all 
analysts expect to find some level of a gen- 
eral, or second-order, factor in analysis of 
a broad range of intellectual activities. Both 
factorially-pure test batteries and factorially- 
complex general tests have been found useful, 
depending on the purpose of the psy- 
chometrics. General tests are normally used 
successfully for populations with a wide 
range of intellectual abilities. We predict 
that something of the same sort will be found 
true in the current apparent disagreement, 
and, with this in mind, propose to examine 
some of the differences between the two 
studies, in an effort to examine the sources 
of disagreement and to clarify our position 
concerning language deficit in aphasia. 


Test Batteries 


An important difference between the two
studies certainly lies, as Jones and Wepman
suggest, in the test batteries
from which the data were obtained. The
Wepman battery was constructed to sample 




selected stimulus-response relationships at a 
uniform level of difficulty: it was considered 
that items should be passed by half and 
failed by half the subjects. Such a battery 
should be most sensitive to covariations of 
different kinds of tests, but, on the other 
hand, places limitations on the kinds of 
tests used, the population studied, or both, 
in order to maintain a specified difficulty 
level. 

The Minnesota Test, on the other hand, 
was a research battery, constructed and re- 
vised over a period of years, to study the 
breakdown of language processes in aphasia 
as comprehensively as possible, and to exam- 
ine changes which occurred with recovery 
or regression. In general, items were retained 
which appeared to discriminate between 
aphasic and nonaphasic populations, between 
segments of aphasic populations, and between 
performances of subjects at various stages of 
recovery. 

Significant differences between the batteries 
resulted from these points of view. In the 
Minnesota battery, when two tests appeared 
redundant (i.e., measured the same function, 
were of the same difficulty level, and, in the 
main, passed and failed the same patients), 
one of the two was dropped to make room 
for new material. An example of this was 
two language tests: one, free word associa- 
tion, and the other, simple sentence comple- 
tion. It seems obvious now that the same 
processes were sampled in both tests, but 
once it was not so apparent. The free word 
association test was dropped and the sentence 
completion test retained because instructions 
were easier for aphasic subjects to follow, 
and administration time was shorter for the 
latter than for the former. Elimination of 
redundant tests tends to minimize group 
factors. On the other hand, in order to sample
breakdowns of performance at higher
levels, the Minnesota battery includes serial
responses (counting to 20, giving days of the 
week, months of the year) in the absence 
of specific cues, and includes tests requiring 
formulation of oral and written responses 
without specific cues (answering simple ques- 
tions, giving autobiographical information, 
defining low-frequency words, expressing 
ideas, explaining similarities, writing sentences 
to use given words, and writing a paragraph 
to describe a picture). There are no such 
tests in the Wepman battery, and while these 
tests are generally difficult for aphasic sub- 
jects, they are considered important in wide- 
range appraisal of language deficit. The lack 
of such tests in the Wepman battery tends to 
minimize general factors.


Patient Populations 


Subjects in the Jones-Wepman test were 
selected by means of an interview and by use 
of a screening test which required perform- 
ance at a given level. The Minnesota pop- 
ulation was selected only to the extent that 
subjects who responded to no test items, 
subjects who missed no test items (and so 
were not diagnosed as aphasic), patients not 
neurologically stable, and patients with a 
medically-confirmed diagnosis of psychosis, 
were excluded. The wide range of the Minne- 
sota battery permitted inclusion of subjects 
at both ends of the continuum who would 
presumably have been excluded by Wepman 
criteria. 


Method of Analysis 


Jones and Wepman consider the Guttman 
scale analysis technique an inappropriate test 
for a general factor. This is not the place 
for defense of the Guttman technique, nor 
are we qualified for such a task. The follow- 
ing explanation of the background of the 
study and choice of method is pertinent, 
however. 

In work with severely impaired aphasic 
patients, who are without functional speech, 
reading, or writing, certain regularities were 
observed during recovery. With a given set 
of materials, auditory recognition (pointing 
to object named by clinician) usually pre- 
ceded repetition; next the word could be 
elicited by a well-established association or 
a question; this, in turn, tended to precede
naming. Not until considerable basic vocabu- 
lary was available was it possible for patients 
to express longer or more complex ideas. 
It was further observed that gains tended 
to cross modalities at all levels. It was to 
test the validity of these clinical observations, 
that is, to ascertain if any such hierarchical 
order could be demonstrated, that scaling 
of tests was undertaken. 

Previous analyses of obtained findings for 
successive samples of patients, tested on 
various forms of the Minnesota Test, had 
shown that characteristic distributions of 
errors were not normal curves but rather 
were U-shaped or J-shaped. Tests at one 
end of the continuum tended to be passed 
by most patients, tests at the opposite end 
failed, while intermediate tests tended to sepa- 
rate patients into two readily discriminable 
groups (passing or failing) of varying propor- 
tions. This seemed to indicate that most tests 
were sets of homogeneous items cutting 
through a continuum (or continua) of deficit 








Schuell, Jenkins: Letter to the Editor 297 


at a particular level. Phi coefficients and Gutt- 
man scaling were preferred to usual product- 
moment correlation techniques because the 
latter are based on the assumption of a 
normal distribution, a criterion these data 
did not meet. 
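As an illustration of the statistic named above, the following is a minimal sketch of a phi coefficient computed from the fourfold table of two pass-fail tests. The function name and the sample scores are hypothetical and are not taken from the Minnesota data.

```python
import math

def phi_coefficient(x, y):
    """Phi coefficient for two equal-length lists of pass (1) / fail (0) scores."""
    # Cells of the fourfold table.
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)  # pass both
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)  # pass x only
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)  # pass y only
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)  # fail both
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return float("nan")  # undefined when a test is passed by all or by none
    return (a * d - b * c) / denom

# Hypothetical scores for six patients on two tests.
test_x = [1, 1, 1, 0, 0, 0]
test_y = [1, 1, 0, 0, 0, 0]
phi = phi_coefficient(test_x, test_y)
```

The coefficient uses only the four cell counts, so it can be computed for any pair of dichotomized tests whatever the shape of the underlying score distributions.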

To criticize the Guttman technique on the 
grounds that a scalable hierarchy may not be 
a general factor is, of course, admissible. Con- 
cerning this point, we commented as fol- 
lows (3): 


‘The question of factorial purity of the 
dimension discussed here may well be 
raised. It is conceivable that the dimension 
may be factorially complex when viewed 
in the context of many other tests. At 
the present time the authors are of the 
opinion that this question can only be 
decided by further empirical studies. The 
work of Gage on opinion poll data (1947) 
indicates that at least under some cir- 
cumstances the Guttman technique may 
isolate the items in a scale which are 
identical with those items appearing on 
the major factor in a factor analysis. 
Whether this is true of the present find- 
ings is a point to be resolved by future 
work.’


Rejection of the Guttman technique on the 
basis of Carroll’s hypothetical demonstration 
(Interim Conference, Symposium on Aphasia, 
Social Science Research Council, Boston, 
1958) is to beg the question and ignore the 
nature of the hypothetical tests used in the 
Carroll demonstration. The difficulty levels 
Carroll assumed forced these tests to scale, 
as is shown by computation of the minimal 
reproducibility of the assumed test data. 
Minimal reproducibility indices presented in 
our data (3) show that the latter data were 
not similarly ‘forced.’ 

Selection of tests included in the analysis
was largely determined by the hypothesis to 
be examined. Tests were selected from Forms 
4, 5, and 6 of the Minnesota battery. Tests 
were necessarily eliminated from considera- 
tion which were not common to all three 
forms of the test. In addition, tests which 
were not primarily language tests (matching 
pictures, colors, forms; phonation, imitations 
of movements of speech musculatures; copy- 
ing and drawing) and tests directly designed 
to measure other skills (numerical relations, 
arithmetic processes, spatial orientation, and 
body image) were excluded from the analysis 
as having little bearing on the language hypothesis.


Of the remaining tests, some were ob- 
viously more reliable than others; tests 
requiring true-false discriminations, for ex- 
ample, are more susceptible to guessing than 
certain other tests, but were retained to 
sample responses of subjects who could not 
speak or write. It is not necessary or desirable 
in a comprehensive battery that all tests 
should scale, nor do high phi coefficients 
assure scalability or reproducibility. Briefly, 
the tests sampled the various domains of 
language function which classical categories 
would have segregated, and yet, surprisingly, 
the tests were scalable. 


The final list of ‘difficulties’ cited critically 
by Jones and Wepman is a little puzzling. 
Certainly inclusion of the homogeneous and 
uniformly low-scoring Group 1 subjects 
would contribute to reproducibility. This 
was explicitly stated, and analysis was re- 
peated with patients in other diagnostic 
categories, both separately and combined, 
with Group 1 subjects omitted. These data 
were included in our article, and showed 
reproducibility of scaling essentially un- 
changed. 


Distribution of test difficulty has already 
been commented on in relation to construction
of the tests. Jones and Wepman suggest that 
a wide range of difficulty indicates the tests 
must have been discriminating poorly. This 
is completely untrue. In the first place, a 
wide range of difficulty is essential in this 
kind of scale, and, in the second place, no 
distribution discriminates more sharply than 
a U-shaped distribution at its optimal level. 


The next question was why detailed scores 
were not utilized, instead of pass-fail treat- 
ment of subtests. It was considered that use 
of a range of possible scores for each test, 
to establish scalability of language deficit, 
was open to criticism on the grounds that 
manipulation of cutting scores would increase 
the likelihood that some cutting point might 
be found which by chance would permit 
the test to scale. Such scaling would of 
course be spurious to some unknown degree. 
It was therefore decided to test scalability 
by setting an arbitrary cutting point. That 
the tests scaled, not only under this handicap 
but also over all diagnostic categories in- 
cluded, is remarkable evidence for the per- 
vasive nature of the dimension identified. 
As was pointed out later in our study, ad- 
justing cutting points for tests previously 
found scalable on a pass-fail basis further 
improved reproducibility of the scale. This 
finding indicates that we did not fully exploit 
the sensitivity of the scores in our first 







analysis; however, we preferred to establish
the reproducibility of the scale before manip- 
ulating scoring procedures. 
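The concern about manipulating cutting scores can be made concrete with a small sketch. It assumes, for illustration only, that a test scored from 0 to m admits cutting points 1 through m; the function names and score ranges are hypothetical.

```python
from math import prod

def dichotomize(raw_scores, cut):
    """Score each subject pass (1) if the raw score reaches the cutting point, else fail (0)."""
    return [1 if s >= cut else 0 for s in raw_scores]

def n_cut_combinations(max_scores):
    """Number of ways to choose one cutting point per test,
    where a test scored 0..m admits cutting points 1..m."""
    return prod(max_scores)

# One arbitrary cutting point per test gives a single pass-fail data set.
pass_fail = dichotomize([0, 3, 5, 10], cut=5)

# Three hypothetical tests, each scored 0..10, with free choice of cutting points.
free_choices = n_cut_combinations([10, 10, 10])
```

With cutting points fixed in advance, only one dichotomized data set is examined. With free choice, the number of candidate data sets grows multiplicatively with the number of tests, which is why some combination might be expected to scale by chance.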

One further point needs to be restated, 
lest readers conclude, from the hypothesis of 
a unitary language deficit crossing modalities, 
that we advocate measuring that deficit alone. 
The senior author has pioneered in the 
development of a classification system (1, 2) 
which repeatedly has proved valuable in 
diagnosis, prognosis, and treatment of aphasic 
patients. This system is based on observed 
patterns of impairment which are differenti- 
ated by the presence or absence of specific 
motor or perceptual defects which are found 
in some subjects and not in others. In ‘An 
Investigation of the Nature of the Language 
Deficit in Aphasia,’ this was restated (3):


‘The reader should not assume that the 
tests in any of the scales given above are 
sufficient to form a test battery for di- 
agnosis and prognosis in aphasia. It ob- 
viously is important to have tests with 
very high visual requirements, to test for 
motor malfunctions and hearing loss, and 
to investigate the patients’ functioning on 
tests of mixed abilities which are im- 
portant in everyday life (for example, 
spatial orientation and mathematical com- 
petence). 

‘It should be re-emphasized here that 
the classification system described earlier 
in the paper is actually based on the pres- 
ence or absence of specific motor or 
perceptual deficits which may or may not 
co-exist with the basic language deficit 
found in all the clinical groups described, 
and that while prognosis appears to be 
related to the over-all pattern of injury 
incurred in the brain, the recovery of 
language, when it occurs, progresses systematically
and relatively independently
of the superimposed defects, as indicated 
by the congruence of clinically observed 
recovery patterns and ordering of tests 
obtained on the clinical scale.’ 


Discussion 


Turning from discussion of the previous 
study, we should like to comment on the 
Jones-Wepman study itself. We find support 
for our hypothesis in their analysis. The ‘size- 
able correlations among factors,’ which lead 
to difficulties of interpretation in their study, 
are, we consider, a reflection of general 
language reduction across modalities. That 
reading (Factor A) is somewhat different 


from repetition (Factor B) supports the rea- 
sonable view that the tasks are differentially 
affected by visual deficits, and that duration 
or continued availability of the stimulus 
plays a role in determining the probability of 
the correct response for patients who can 
utilize such stimuli. 

In a study currently in press (4), we have 
reported research on a test involving pointing 
to pictures of objects named. The test was 
first presented with auditory stimuli alone 
and repeated with combined auditory and 
visual stimuli. The presentations correlated 
highly (+.80), with the second task proving
easier, as expected. Departures from perfect
agreement in ordering of subjects were
in the main attributable to the fact that sub- 
jects in Groups 3 and 5, who showed cere- 
bral involvement of visual processes, did not 
benefit as much as other subjects from added 
visual cues. 


We are pleased that ‘copying’ (Factor C), 
and ‘arithmetic’ (Factor F) fall out as rela- 
tively independent of the language function; 
this is in agreement with our rationale for 
excluding such tests from an inquiry into 
general language deficit. We regard ‘writing 
to dictation,’ the prominent tasks on Factor
D, as a complex of visual and visuomotor 
skills and auditory recall; hence we should 
have expected these tests to load on language 
factors as they do. 

We cannot yet say whether the Jones and 
Wepman ‘comprehension of language sym- 
bols’ (Factor E) is most closely allied with 
the general language deficit which we see in 
aphasia. It seems probable that it is, judging 
from the nature of many of the tests found 
scalable in our study, but the heavy reliance 
of this factor on picture matching as a re- 
sponse modality raises some questions. 

We do not expect residual differences of 
opinion to be resolved by this discussion. 
What is obviously required is further searching,
critical work. There is now in progress a
factor analysis of data obtained from the 
73 tests on Form 6 of the Minnesota battery 
administered to 157 aphasic subjects. We 
trust this will help us integrate our findings 
more closely with those of Jones and Wep- 
man and enable us to move another step 
forward in the precise description of aphasia. 

Finally, to avoid misunderstanding, we 
should like to recapitulate the conclusions 
stated in our earlier study. We believe that 
classical divisions of aphasia into sensory and 
motor dichotomies and isolated pure defects 
are in error; we consider current evidence 
























compatible with the hypothesis of an under- 
lying general language deficit crossing modali- 
ties. We do not believe either identification 
or assessment of this general language deficit 
constitutes adequate information about any 
aphasic patient. We strongly advocate a com- 
prehensive battery of tests, over a wide 
range of difficulty levels, to evaluate per- 
ceptual and motor disabilities and the effects 
of these deficits on various language per- 
formances; we consider this information 
essential for diagnosis, prognosis, and treat- 
ment of aphasia. 

We realistically expect that, over a large 
sample of aphasic patients, many dimensions 
of impairment resulting from brain damage 
are identifiable, and need to be studied, in 
addition to the common or general dimension 
of language deficit. We further expect that, 
at a given level of language deficit, language 
tests may be arranged in subgroups which 
show systematic regularities in aphasic per- 
formance in various modalities as well as 
systematic differences in the performance of 
various segments of aphasic populations. We 
do not believe this is in conflict with the


notion of a general dimension of language 
deficit present in aphasia. 

We trust finally that this will be a rich 
and exciting field of further inquiry. 


Hildred Schuell, Director 

Aphasia Section, Neurology Service 

Minneapolis Veterans Administration 
Hospital 

Associate Professor of Neurology 

University of Minnesota Medical School 


James J. Jenkins 
Professor of Psychology 
University of Minnesota 


References 


1. Schuell, Hildred, Diagnosis and prognosis
in aphasia. Arch. Neurol. Psychiat.,
74, 1955, 308-315.

2. Schuell, Hildred, A short examination for
aphasia. Neurology, 7, 1957, 625-634.

3. Schuell, Hildred, and Jenkins, J. J., The
nature of language deficit in aphasia.
Psychol. Rev., 66, 1959, 45-67.

4. Schuell, Hildred, and Jenkins, J. J., Reduction
of vocabulary in aphasia. Brain,
in press.





